League General Discussion Thread

Stucifer

Wanted to double check a rule here for BM/PtV:

In OOB you can attack the USSR with Italy and then NCM German units in on their turn, which still gives you the NO for trade even though you are in original USSR territory. Is that legal for League play?

MrRoboto

Yes it is

gamerman01

Boom a thought just hit me.

If anyone is wondering about “grade inflation” where we have a lot more E’s than 2’s and 3’s this year, I think I’ve suddenly realized a major factor.

Rather than starting everyone out at 0 each year, the ranking system carries over the prior-year reputation of everyone who finished 3+ games that year.

So we start the year with a lot of 1’s, E’s, M’s, and those players are much less likely to leave the league because they’re crushing it. The weaker players lose heart and go back to beating their neighbor kid next door!

It’s all relative anyway, so it’s still good. (e.g. a 4.50 is definitely significantly higher than a 4.00) But it did bother me that we haven’t had a central distribution surrounding 4.00, middle of tier 1, and I’m pretty sure (without even thinking for 10 minutes) that that is why.

I post it here so you guys can kick it around a bit and give thoughts.

MrRoboto

This probably explains the inflation of points.
However, this is the least of the flaws the current system has. While I do appreciate that the current system is a vast improvement over what was in place before (a simple Win%, nothing else), I still have numerous issues with it.

1) No definite points after a result
Imagine the following scenario:
Player A (Tier 1) wins against Player B (Tier 1).
You could score Player B first. He lost against a Tier 1, so receives 2 points, is at 2.0 PPG and is therefore dropped to Tier 3. Player A now receives 4 points for winning against Tier 3, is at 4.0 PPG and stays Tier 1.
Or you could score Player A first. He won against a Tier 1, so receives 6 points, is at 6.0 PPG and climbs to Tier M. Player B now lost against a Tier M so receives 4 points, is at 4.0 PPG and stays at Tier 1.
If you score winners first, you have points inflation.
2) End of year standing actually reflects yearly average
Your end of year PPG is actually a yearly average and not a reflection of your skill at the end of the year. Have you improved over the course of the year? Hard to tell, your skill might be Tier M at the end, but you still score at Tier 1.
3) Not predictable
If I play against a certain player, I don’t know how much that game is worth until the 31st of December. My opponent can climb or fall a Tier until the end of the year, which will retroactively change my PPG. That will even happen without my interaction with said player! I might stop playing in June and my PPG might still change severely before the end of the year, which is very unintuitive. And I don’t think it makes a lot of sense.
4) Discourages playing weaker opponents
Everyone who has played Tier 3 or Tier 2 has lowered their PPG, even with a win! We are talking about 50+ games this year alone.
As long as your PPG is higher than 4.0 (This is only Tier 1, the MIDDLE of the pack, not elite players!), EVERY win against Tier 3 players is hurting you!
As long as your PPG is higher than 5.0 (Still not the highest Tier, only Tier E at this point), even winning against Tier 2 is detrimental.
And when your PPG is higher than 6.0 (All Tier M, but even some Tier E players!), you shouldn’t even play against the middle Tier, Tier 1. Even a win would hurt your PPG.
5) Losing can help
Every Tier 3, every Tier 2 and even half of Tier 1 can easily improve by just losing to a Tier M.
Sorry, but this makes no sense.
6) Results against new players depend on a single moderator
Currently, new players are not ranked consistently. As an example: @jkeller is 0-1 but placed at Tier M. So @AndrewAAGamer received 8 points for that win.
On the other hand, @Gorshak is 2-0 against two top players, @666 and @GeneralDisarray, but still placed at Tier 1. So their losses against him gave both of them only 2 points, DRASTICALLY lowering their PPG.
7) Circular referencing
A player’s Tier affects the points opponents get. These points affect the opponents’ PPG and thus the Tier of the opponents. This Tier affects the points the original player gets and thus that player’s Tier. But this Tier originally affected the points the opponents get.
This goes back to issue 1, but the problem is more widespread and generalized than the example in 1).
You can reach dozens of different rankings of every single player here, with the exact same game results, depending on the order of calculations and the order of reporting.
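
The Player A vs. Player B scenario above can be replayed in a few lines. This is only a sketch: the point values and tier placements are the ones quoted in that example, the full league table is not reproduced, and the function and dictionary names are made up for illustration.

```python
# Values taken from the Player A / Player B example in this post only.
WIN_POINTS = {"1": 6, "3": 4}    # points for beating an opponent of that tier
LOSS_POINTS = {"1": 2, "M": 4}   # points for losing to an opponent of that tier
# Tier a player lands in after a single game worth that many points
# (after one game, PPG equals the points of that game).
TIER_AFTER_ONE_GAME = {2: "3", 4: "1", 6: "M"}


def score_single_game(score_winner_first):
    """Score one Tier 1 vs Tier 1 game; return (winner_ppg, loser_ppg)."""
    winner_tier = loser_tier = "1"
    if score_winner_first:
        winner_ppg = WIN_POINTS[loser_tier]
        winner_tier = TIER_AFTER_ONE_GAME[winner_ppg]  # winner re-tiered first
        loser_ppg = LOSS_POINTS[winner_tier]
    else:
        loser_ppg = LOSS_POINTS[winner_tier]
        loser_tier = TIER_AFTER_ONE_GAME[loser_ppg]    # loser re-tiered first
        winner_ppg = WIN_POINTS[loser_tier]
    return winner_ppg, loser_ppg


print(score_single_game(score_winner_first=True))   # (6, 4)
print(score_single_game(score_winner_first=False))  # (4, 2)
```

Same game, same rules, and the pair ends up with either 10 or 6 points in total depending only on who is scored first.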

There are a few minor things that theoretically could also get fixed, but those 7 points are major flaws with a PPG based system.
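
The "Discourages playing weaker opponents" and "Losing can help" points both follow from PPG being a plain average. A minimal sketch, assuming PPG is simply total points divided by games played; the helper name and the sample players are hypothetical, while the 4-point awards are the ones quoted above (win against Tier 3, loss against Tier M).

```python
def ppg_after_game(current_ppg, games_played, points_awarded):
    """Return the new points-per-game average after one more result."""
    total_points = current_ppg * games_played + points_awarded
    return total_points / (games_played + 1)


# A 5.0 PPG player with 4 games WINS against Tier 3 (4 points): PPG drops.
print(ppg_after_game(5.0, 4, 4))  # 4.8

# A 3.0 PPG player with 4 games LOSES against Tier M (4 points): PPG rises.
print(ppg_after_game(3.0, 4, 4))  # 3.2
```

Any award below your current average pulls it down, and any award above it pulls it up, regardless of whether the game was won or lost.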

I have created an improvement, which is ELO based (a system used in games like Chess, World of Warcraft or League of Legends). That fixes all of the issues above and the spreadsheet is already finished too, fully automatic even.
I’m just waiting for feedback from @gamerman01 before I share it with all of you

Ghostglider

Moving towards an ELO based system sounds like an improvement to me.

Martin

@MrRoboto thank you for this analysis, and well presented! The ELO system was suggested a few times during the last years, and I also strongly support it. And as I stated before, there are plenty of management systems available which would ease the job of the score keeper / league manager.

AndrewAAGamer

I played at Days of Infamy and they used an ELO system. While I do not understand the finer points as you obviously do, the best thing about it was that you could play anyone; the points earned or lost reflected that. As a top player I would only gain a minimum of points, and they would lose a minimum of points, for beating a bottom player, and yet if they could score a win they gained a ton of points and I would lose a ton of points. Therefore, there was no discouragement to playing anyone as there is here.
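
The behaviour described here is what the standard Elo formula produces. A minimal sketch, assuming the textbook expected-score formula with a 400-point scale; the K-factor of 32 and the example ratings are assumptions for illustration, not values from Days of Infamy or from MrRoboto's spreadsheet.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score (between 0 and 1) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_update(winner_rating, loser_rating, k=32):
    """Return (new_winner_rating, new_loser_rating) after a decisive game."""
    gain = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + gain, loser_rating - gain


# Top player (1900) beats a bottom player (1200): almost nothing moves.
print(elo_update(1900, 1200))

# The bottom player scores the upset: nearly the full K changes hands.
print(elo_update(1200, 1900))
```

Because the winner gains exactly what the loser gives up, the result does not depend on which player is scored first.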

Plus, it sounds like it would greatly alleviate the work load for the League Moderator which is a very good thing.

jkeller

I don’t really appreciate being used as a counter example here. I decided to take a break this year, but if you look at my results from last year you would see that I clearly belong in the M tier or you can ask any of my opponents. AAGamer absolutely deserves every one of those points.

As to the entire argument, I have no issue with an ELO rating system in general, as I am a lifelong avid chess player (Master there too, fwiw) and it is generally pretty accurate. I would say that, due to the length of time it takes to finish one of these games and the resulting low sample size, the ELO rating would be far less accurate, because it relies on a larger number of games to be accurate. In other words, AA players just play too few games for an ELO rating to be accurate.

MrRoboto

@jkeller I’m sure your results in the past warrant you being placed in Tier M and it probably reflects your skill too.

The point still stands, though. Different players are handled differently because of the judgement of one single person. Now that moderator is luckily extremely fair, very benevolent and we can rely on him, but I think a system shouldn’t be dependent on having @gamerman01 around ;-)

But this raises an interesting issue, that’s absolutely up for debate: Should past results have any kind of influence on the current year? If yes, how big should that impact be?

Generally, there are two ways to think about it in an ELO system.

1) No influence at all. New year, clean slate.
This happens in most major sports: Basketball, Football, American Football, you name it. A new season starts and every past result is wiped clean, every team / player starts anew with the same clean sheet.
2) Retain some ELO rating from the past
When the new season / year starts, the system could check the difference between the final ELO rating and the starting rating and then only move the new rating a certain percentage to the starting rating.
For example: Starting Rating is 1500 for everyone. Player A finishes a year with 1200 and player B with 1900 Rating.
We could say that instead of starting the new year at 1500 for everyone, we keep 40%.
So A keeps 40% of -300 (1200-1500) and starts the year with 1380.
So B keeps 40% of 400 (1900-1500) and starts the year with 1660.

Of course we could also use Option 2) but only for players above 1500, so that the lower skilled players have a new chance to start at the beginning, without baggage from last year.
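
The carry-over arithmetic above, including the variant that only applies to players above 1500, can be written as a short sketch. The 1500 starting rating and the 40% figure come from the example; the function name, the `only_above_start` flag and the rounding to whole points are assumptions for illustration.

```python
START_RATING = 1500


def carry_over(final_rating, keep=0.40, only_above_start=False):
    """Starting rating for the new year, keeping a share of last year's gap."""
    diff = final_rating - START_RATING
    if only_above_start and diff < 0:
        # Variant: players below the start value get a clean slate.
        return START_RATING
    return round(START_RATING + keep * diff)


print(carry_over(1200))                         # 1380 (Player A)
print(carry_over(1900))                         # 1660 (Player B)
print(carry_over(1200, only_above_start=True))  # 1500
```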

I’m interested to hear the opinions of the community. Personally, I tend to lean towards option 2), because as @jkeller rightfully pointed out, we don’t have that many games here so this would balance this out a bit.

MrRoboto

By the way, this approach is used by many competitive online games.

Since climbing the ladder is time consuming and exhausting, players usually don’t fall back to square one when a new season starts, but fall back to certain safe thresholds instead. League of Legends and Hearthstone, for example, use similar systems.

jkeller

In chess, which is probably more similar to this as it’s an individual sport without much change from year to year (a mental sport with no roster changes, coaching changes, injuries or much age-based decline), you absolutely base this year’s initial rating on last year’s final rating. Why not? We need as much data as we can get, especially in this situation with an extremely low sample size.

MrRoboto

Well if I’m not mistaken, chess sites like chess.com don’t even use seasons or yearly ratings at all. Isn’t it basically just a lifelong Rating, that you work on as long and as much as you are willing to?

Because in the end, for the reasons you stated, there would be no big reason to reset ratings on Jan 1st at all.
Do we want that? We could…

Actually, the more I think about it, the more inclined I am to agree with that sentiment.

Martin

Just brainstorming: could it make sense to have the current rating always be based on the games finished during the last 365 days (rolling)? At least tennis players lose their tournament points after one year.
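
A rough sketch of what such a rolling window could look like: only games finished within the last 365 days are kept before any rating is computed. The record layout (finish date, winner, loser) and the function name are hypothetical, and the player names are placeholders.

```python
from datetime import date, timedelta


def games_in_window(games, today, window_days=365):
    """Keep only games whose finish date lies within the rolling window."""
    cutoff = today - timedelta(days=window_days)
    return [game for game in games if cutoff < game[0] <= today]


# Each record: (finished_on, winner, loser); placeholder data only.
games = [
    (date(2023, 5, 1), "PlayerA", "PlayerB"),
    (date(2023, 12, 1), "PlayerB", "PlayerC"),
    (date(2024, 3, 15), "PlayerA", "PlayerC"),
]
recent = games_in_window(games, today=date(2024, 6, 1))
print(len(recent))  # 2, the May 2023 game has dropped out
```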

MrRoboto

Sure, everything is possible and I already know how to implement it.

Can you explain the upsides of that, @Martin? I don’t see a big improvement in a rolling 365-day rating over a calendar-year rating.

I’d rather have some kind of decay for inactivity.

Martin

@MrRoboto this was just an alternative to the “life long rating” which you mentioned.

Stucifer

I like the idea of a lifelong rating, as you mention about chess.com. I think we could use all the games we can get to influence ELO if the league switches to that method.

farmboy

I’m not opposed to trying an ELO (or another) system either but would worry that the small sample size is going to mess it up more substantially. I do think past results should matter each new year (As far as I’m aware, in the current scoring, they just matter until someone has played 3 games and then their score is based on the outcome of the games played in the current year. So their effect is to increase the score of the opponent in the early games). And I also think we need a reset each year to ensure some variation in who gets to play in the league finals. A longer term ELO with the relatively small number of games played may make it harder for new (or much improved) players to break in.

My view is that if it isn’t (that) broke, don’t fix it. I don’t doubt there are some issues with scoring. But we are a small amateur (and very niche) league with a small number of games played and with variation in the number of games played by each player. Nothing is going to be perfect. And I don’t think changes in scoring will dramatically shift how the final standings look. Of course if we can use a method that is more streamlined and easier to use (not just for gamerman but for whoever comes after), I’m not opposed to that either. But the existing method of scoring doesn’t strike me as particularly onerous either.

And if we want to avoid having too many players ranked E or M, one way to address it is just to shift the thresholds where one crosses over. E could be set at 5 and M at 6 or higher, for example, and that would spread things out more.

AndrewAAGamer

If I recall correctly, at Days of Infamy, it was a lifelong record.

MrRoboto

I genuinely appreciate your feedback, @farmboy !
Always good to hear some different opinions.

Your idea of a fix would help combat the points inflation, for sure.
But as I said, that is the smallest of the issues. It would not fix any of the 7 major problems the current system has.
Now how broken is the system? I guess that’s subjective. It IS working, for sure. It creates a somewhat realistic ranking and most of the time the lower PPG player actually loses against the higher PPG player.
So in general, the higher the PPG, the better the player.

But in my opinion, that is a very low bar. If it didn’t fulfill this basic requirement, it wouldn’t be a working system at all. Our demand on a system should be higher than that, even if we are only an amateur league. We can still strive to be as professional as possible.

@farmboy said in League General Discussion Thread:

And I don’t think changes in scoring will dramatically shift how the final standings look.

This is the current ranking in the official spreadsheet.

This is the ranking in my automated spreadsheet:

But this is another ranking with the exact same rules. No formulas changed, no other entries, everything is the same. Just the order of calculations shifted.

Notice how gamerman went from rank 4 to rank 13!

Or another ranking. Again: same rules, same results, same formulas.

Now I don’t know about you but if a system can produce multiple different rankings depending on HOW you apply the rules, I personally would consider that system broken. We don’t even know who #1 is right now…

You did raise 2 concerns with an ELO system however and I want to address those:

The small number of games is offset in my system: The first few games have a bigger impact on ELO change. This gradually diminishes with the number of games until it settles at 10 completed games (exact number up for debate).
It would be completely negated by the way, if we use a lifelong ELO rating.
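
One way such a diminishing impact could be modelled is a K-factor that shrinks with the number of completed games. This is only a sketch: the linear shape and the values 64 and 32 are assumptions, and only the settling point of 10 games comes from the description above. The actual spreadsheet may work differently.

```python
def k_factor(games_completed, k_new=64.0, k_settled=32.0, settle_after=10):
    """K-factor that starts high for new players and settles after N games."""
    if games_completed >= settle_after:
        return k_settled
    progress = games_completed / settle_after
    # Linear interpolation from k_new (0 games) down to k_settled.
    return k_new - (k_new - k_settled) * progress


for n in (0, 5, 10, 25):
    print(n, k_factor(n))
```

A larger K means a single result moves the rating further, so a new player's first few games count more than later ones.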

The other concern is how difficult it is for new players to enter the top ranks.
This is actually done super fast. If you win a handful of games against current top players, you will climb the ELO extremely fast and can reach the top spots.
I just tested it with my ELO system. A new player could claim the #1 rank in my ELO system after going 4-0 with 2 wins against top player Adam and another 2 wins against GeneralDisarray and ArthurBomberHarris.

That being said: How we choose the participants for yearly playoffs is another matter. We could only count the results of the calendar year. We could (and should) require a certain number of games played this year.
Right now the participants are largely the same group of players too. This would actually even change with ELO, for the better!

farmboy

@MrRoboto thanks for the response and explanation.

One point to clarify. I’m not worried that an ELO will prevent people from entering the playoffs. But I am worried that trying to score over the longer term will. As long as the scoring for the playoffs is primarily determined by one’s play in a given year, then it should be fine with or without an ELO.

I certainly don’t have a great understanding of the math that goes into this, but if the issue is that the rules will produce different results depending on the order of the calculations, then could one solution simply be to be consistent in the order of the calculations?

I do think given the small and variable number of games most of us play (and that we don’t play everyone else), we are always going to find that the rankings won’t quite match up with reality, but it has always seemed to me that we are pretty close. And that is good enough for me. But I’m certainly open to trying alternatives.
