The new ELO-based ranking system

  • '19 '18

    Dear community,

    there is a discussion going on right now concerning revamping / overhauling the current ranking system.

    While the current system is doing okay and we could certainly just leave it as is, there are some issues with it. How severe are they? I guess that’s up to personal opinion. For me they are major flaws.

    I have listed these in another post, over in the league general discussion thread:
    https://www.axisandallies.org/forums/post/1664594

    Please also have a look at this post:
    https://www.axisandallies.org/forums/post/1664652

    Now I have been working these past few weeks on an improved system and I’d like to introduce it to you.

    The spreadsheet is here. I hope it’s easy to understand, but an explanation will follow.

    https://docs.google.com/spreadsheets/d/e/2PACX-1vTfxxzz_KifSSJTAXSIAUUXGiVEFnzOakleCGT7ho8YpjsqaOqpPswl7LxXhe8vRDHBC0lwElQusU3q/pubhtml

    Short and simple version for everyone:

    Every player starts at a base rating of 1500.
    Wins award points while losses give negative points.
    Winning against higher ranked opponents gives more points than winning against lower ranked ones.
    Accordingly, losses against low ranked opponents reduce the ELO rating more than losses against better players.

    That’s basically all you need to know: The higher your ELO rating, the better you are!

    Tier levels are included but just serve as a visual cue or a motivation. They don’t have any impact on the ranking.

    For everyone interested, here are some more details:

    Whenever a game is posted, the Ratings of both Players A and B will change.

    The exact formula is:

    RAnew = RAold + K × (S − EA)

    RAnew = Rating of Player A after the game
    RAold = Rating of Player A before the game
    K = Factor that increases / decreases the points - more on that later
    S = 1 if Player A has won, 0 if Player A has lost
    EA = Expected outcome for Player A

    What is EA?

    EA = 1 / (1 + 10^((RBold − RAold) / F))

    RBold = Rating of Player B before the game

    Now this might look complicated to some. But what this formula does is easy:
    The further Player A's rating is above Player B's before the game, the closer EA gets to 1, so a win (S − EA) earns less. Therefore a win against a much lower ranked player is not worth a lot, while winning against a similar or even higher ranked player is worth a lot more.
    For losses the opposite is true.
    A win by the current #1 against the current last place will award only a meager 4 points to the winner and -2 to the loser.
    However, the last-placed player would receive a whopping 136 for a win and #1 would suffer -87 for that loss!
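
    A minimal sketch of the update described above, assuming the standard Elo form RAnew = RAold + K × (S − EA) with EA = 1 / (1 + 10^((RBold − RAold) / F)). The function names, the example ratings and the K value of 40 are made up for illustration and are not taken from the spreadsheet.

```python
def expected_score(rating_a: float, rating_b: float, f: float = 500.0) -> float:
    """Expected outcome EA for Player A, a value between 0 and 1."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / f))


def updated_rating(rating_a: float, rating_b: float, a_won: bool,
                   k: float, f: float = 500.0) -> float:
    """New rating for Player A: RAnew = RAold + K * (S - EA)."""
    s = 1.0 if a_won else 0.0
    return rating_a + k * (s - expected_score(rating_a, rating_b, f))


# Hypothetical ratings and K value, chosen only for illustration.
favorite, underdog, k = 1900.0, 1400.0, 40.0

# The favorite gains little for the expected win ...
print(updated_rating(favorite, underdog, a_won=True, k=k) - favorite)
# ... while the underdog gains a lot for the upset.
print(updated_rating(underdog, favorite, a_won=True, k=k) - underdog)
```

    With the same K for both players, the winner's gain and the loser's loss are equal in size. The differing numbers in the example above (4 and -2) would be consistent with each player using their own K.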

    Now the Factor F is important: I set it to 500.
    This means that a player with an Elo rating 500 higher than the opponent is 10x as likely to win the game.

    Increasing this factor would lead to a wider field of ELO Ratings, while lowering the factor would squeeze everyone closer together.
    I found 500 to be quite suitable for our needs but this might change in the future with more games coming in.
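
    To make the role of F concrete, here is a small sketch that assumes the standard Elo expected-score formula. It checks the "10x as likely" statement for a 500-point gap and shows how other values of F would treat the same gap. The function names are illustrative. 400 is the value commonly used in chess, and 600 is just an arbitrary larger value for comparison.

```python
def expected_score(rating_gap: float, f: float) -> float:
    """Expected outcome for the higher-rated player, given the rating gap and F."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / f))


def win_odds(rating_gap: float, f: float) -> float:
    """Odds of winning versus losing for the higher-rated player."""
    e = expected_score(rating_gap, f)
    return e / (1.0 - e)


# Compare how three values of F treat the same 500-point gap.
for f in (400.0, 500.0, 600.0):
    print(f, round(expected_score(500.0, f), 3), round(win_odds(500.0, f), 1))
```

    A larger F makes the same 500-point gap count for less, so ratings have to spread further apart to express the same difference in strength.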

    Now the other important thing: The K factor.
    This factor is quite high in the first couple of games and then diminishes gradually. This is so a player can rapidly find her or his correct place in the ranking system. A very strong player would not need to play dozens of games to climb to the top - the system would realize the strength very quickly and move that player to the top in just a few games. Same of course for not-so-good players.
    The exact numbers are up for debate, but I have for now settled on these values:

    [Image: table of K values by number of games played]

    We can change these numbers if people think the impact of the first couple of games is too high or too low.
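
    A sketch of how such a diminishing K factor could be looked up. The thresholds and K values below are placeholders invented for illustration. They are NOT the values from the spreadsheet.

```python
# Hypothetical schedule: (upper bound on games already played, K value).
# Placeholder numbers only, not the values used in the spreadsheet.
K_SCHEDULE = [(3, 120.0), (6, 80.0), (10, 60.0)]
K_FLOOR = 40.0  # hypothetical K for established players


def k_factor(games_played: int) -> float:
    """Return K for a player who has already finished `games_played` games."""
    for upper_bound, k in K_SCHEDULE:
        if games_played < upper_bound:
            return k
    return K_FLOOR


# K shrinks as a player accumulates games.
print([k_factor(n) for n in (0, 3, 6, 10, 50)])
```

    Raising the early values makes new players move faster but also makes one unlucky early loss more costly. Lowering them does the opposite.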

    .
    .
    .

    Advantages of this system:

    Besides solving the issues I mentioned in my other post, there are the following upsides:

    1. Transparency
      Everybody can always see the amount of points a result gave at any time.

    2. Climbing is always possible
      ELO is not set in stone. Climbing or falling can always be done: The more drastic the change in skill, the faster the ELO change will be.
      PPG, on the other hand, becomes more stable with each additional game finished.

    3. ELO reflects the CURRENT strength
      You can see how strong every player is RIGHT NOW, not how strong they were on average over the year.

    4. No games are discouraged
      No strategic avoiding of games / players anymore!

    5. No theoretical end, improvement is always possible
      PPG always has a maximum. You can reach that maximum with your first game! You might need to complete the necessary amount of games to qualify for playoffs but if you go 3-0 against highest Tier, there is no way to get higher than that.
      ELO can always be improved, there is no limit!

    6. Filtering
      You can filter the results quickly. If you want to know how a specific player is doing with the Axis, you can find out!

    .
    .
    .
    Now this is a work in progress. I will gladly take your feedback and discuss it with the community. The goal is to find a system that the majority is happy with!

    Some things are still TODO at the moment and I hope I will get to them asap.

    1. ELO Decay
      I am planning to implement a decay if someone is inactive. My idea is that only ratings above the starting Rating of 1500 will decay. The decay could start after 6 months of inactivity and then the ELO could drop by 10% per month until it reaches 1500. Please give some feedback!

    2. Two different rankings
      I am in favour of a lifetime ranking. However, most of us love the yearly playoffs and we need some kind of requirements to qualify for them. So my idea is to have two columns: One for the lifetime ELO and one that only uses the results of the current year. That way we can see overall ranking and yearly ranking at the same time. We could keep the requirement of 3 completed games in the current year to qualify for playoffs.

    3. Factor in bids?
      @mr_stucifer came up with the idea of factoring the bids into the results.
      For example: For every point the bid is above the average bid, the ELO-change could be 5% bigger than usual (and vice versa). I would need to work out the exact numbers. But I’m not sure if that’s even desirable in the first place? Your feedback is appreciated!
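
    A rough sketch of the bid idea, only to show the arithmetic. The function name, the parameter names and the example numbers are invented. The 5% per bid point comes from the example above. Which player's change gets scaled, and in which direction, is still an open question here.

```python
def bid_adjusted_change(base_change: float, bid: float, average_bid: float,
                        pct_per_point: float = 0.05) -> float:
    """Scale an Elo change by 5% for every bid point above the average bid
    (and shrink it by 5% for every point below)."""
    return base_change * (1.0 + pct_per_point * (bid - average_bid))


# Hypothetical numbers: a base change of 20 points with an average bid of 30.
print(bid_adjusted_change(20.0, bid=32.0, average_bid=30.0))  # bid 2 above average
print(bid_adjusted_change(20.0, bid=28.0, average_bid=30.0))  # bid 2 below average
```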

  • 2024 2023 '22 '21

    @MrRoboto after a first quick (!) screening, I like this system very much. I just do not agree with this drastic decay of 10% per month starting after six months of absence.

    This would mean a drop from 2600 to 1400 in one year. I do not think that Fischer or Kasparov would have liked to fight their way back up from an ELO of 1500 again. 10% decay per year seems reasonable to me.

  • 2024 2023

    @MrRoboto Thanks for the detailed post, I don’t have much input at this time but love the presentation!

    As @Martin pointed out, an annual rate or a lower monthly rate might be a better spot for decay, if implemented. 3% would take someone at 2400 ELO down to 2126 after a full year of inactivity, which is probably okay. 5% would be 1765, which feels quite low for taking a break. I know the floor would be 1500. Maybe 3%, or at least no more than 5%, in my opinion :)


  • Hi guys,

    I am very excited with what MrRoboto has cranked out - he is very high energy and doing a lot of work to improve the league.

    The final product of this effort is going to be awesome but it’s not going to be rolled out until I’m satisfied and also the majority of league players are happy. Or in other words, until we don’t have many dissenters. No one wants a change that a lot of people don’t like.

    I’m only going to post publicly right now that I have major concerns about an ELO system, because I think you can only rise or fall a certain number of points per game, and most players play few games because a game of G40 takes a whole lot of hours, as we all know.

    It’s not chess, and it’s not a sports league where everyone plays the same number of games. jkeller and farmboy have chimed in to say the same: ELO doesn’t work very well when few, or varied numbers of, games are played.

    I am writing privately with MrRoboto a lot, in order to make sure I understand this beautiful ranking spreadsheet, before saying much more. We will work together and keep communicating to you guys so you are kept in the loop. There’s got to be a way to level things out for vast differences in number of games played.


  • I really like the idea of decay, especially because it reflects reality. Unless a player continues playing G40 somewhere else, and the same version of it, they are going to start getting some rust over time.

  • 2024 2023 '22 '21 '20

    Personally, I don’t like the idea of the lifetime tracking to have any decay in it. If a person needs some time off why should their lifetime ranking be affected? Or, even if they stopped playing entirely, why wouldn’t we want their history to stay unaffected so we could go - “wow, look how good xxx was when he/she played here.”

    I am all for a separate yearly ranking that would determine playoff position and of course affect the lifetime ranking.

    The thing I like best about his system, as a high ranking player, is that there is no detriment to playing anyone from any tier. Yes, there is a gigantic risk to your score if you lose, but there is no penalty to your score just because you played somebody in a lower tier. I think everyone being able to play everyone will be good for the community.

  • '19 '18

    The concerns regarding the low amount of games played are all valid. I do agree with them actually!
    That’s why I included the “K”-Factor in the first place. And I toyed around with the numbers, had them a lot higher for the first couple of games too. But this has negative effects too… If you are unlucky in one of your earliest games, it feels frustrating to need 3 wins to offset an unlucky early loss.

    All the more reason for lifetime rating ;-)
    I will try to add in more results from before 2023 asap. You will see that the ratings are a lot more stable then and I’m confident that this will satisfy the concerns.

  • '19 '18

    I was unclear about what I meant by 10% decay.
    I meant 10% of the difference between current ELO and 1500.

    So a 2000 player would lose 10% of 500, aka 50. Then he is at 1950, so the next 10% would bring him down to 1905 and then 1865…
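
    The decay rule just described (10% of the gap between the current ELO and 1500, per step) can be sketched like this. The function name and defaults are illustrative.

```python
def decay_step(rating: float, base: float = 1500.0, rate: float = 0.10) -> float:
    """Remove `rate` of the gap between the rating and the base rating.
    Ratings at or below the base are left untouched."""
    if rating <= base:
        return rating
    return rating - rate * (rating - base)


rating = 2000.0
history = []
for _ in range(3):
    rating = decay_step(rating)
    history.append(rating)

# Roughly 1950, 1905 and 1864.5 (rounded to 1865 in the post above).
print(history)
```

    Because each step takes a share of the remaining gap, the rating approaches 1500 ever more slowly and never drops below it.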

    But yeah, I agree - that’s still too harsh.

    I’m inclined to go with AndrewAAGamer…
    No decay for the overall lifetime ranking.

    And for the yearly playoffs we only count results from the current year anyway, so decay is not really necessary in the first place.


  • @MrRoboto thanks for all the work on this.

    One concern. I like the idea of capturing a player’s current strength. But because we play a small number of games, I would worry here that two or three games would have too much weight. So if a player that was tier 1 (with 5 games played) ends up beating 3 tier M players in a row, that is pretty good evidence that they are at the same level, and they are going to be a top player according to either the existing scoring or the ELO. But if a tier M player (with 5 games played) loses 3 games in a row to other tier M players, with the current scoring that doesn’t necessarily mean that they are no longer tier M, and I am worried that with an ELO the drop might be more dramatic. So for that reason I like that the current league scoring averages over a year.

    One suggestion. This might be onerous, but could you score a year of play so that we could see how this ends up? That would both give us a sense of how well this works and also what might need to be tweaked.

  • '19 '18

    I absolutely can and will add that, farmboy.
    I will also add an indicator for when a player is inactive (should 6 months of not playing be considered inactive, or rather 12 months?)

    But in my opinion, if a player loses 3 games in a row against other Tier M players, then that player shouldn’t be considered M for the moment. He would need to prove himself again with wins to reclaim M tier…

  • '19 '18

    To everybody: You can help me collect past results!

    It’s 3 am here and I have 3 kids, including a 3-month-old baby. It’s past time I go to bed haha.

    Go to the Sheet that’s named “Results before 2023” and enter the data.
    There are only 6 cells required and I named them, so I think it’s self-explanatory.

    The last entry I made myself is on Page 109 of the Post results thread; I was working my way backwards…

  • 2024 2023

    @MrRoboto I can help with this shortly


  • @MrRoboto so I just opened the spreadsheets. That is very impressive. I won’t have time to help now but didn’t realize you had already done so much work.

  • 2024 2023

    @MrRoboto sent you a PM as well, look for an email with a link to my spreadsheet for copying data over. I rearranged the columns on my end for faster entry, the way my brain reads the posts.

    Got back to early November 2021 (page 93 fully completed)


  • Nice, a group project to get more years recorded. After personally and manually entering every game result for years, I have quite a strong grasp on how good each player is, in a way that can’t really be quantified with bare numbers or a formula.

    So I am hoping to see how the numbers will fall with 1/1/22 to date, 1/1/21 to date, or however far we can go back. To the beginning of G40 is the dream. Shouldn’t matter much that the rule sets changed dramatically (especially balanced mod) at some points, since it was even competition between many of the same players anyway.

    I hope the lifetime ratings will pretty much line up with my experience and memory, and it will be fascinating to see players from years ago stack up in the same rankings against contemporaries.

  • 2024 2023 '22 '15 '11 '10 Official Q&A Moderator

    So with a lifetime… no K factor adjustment for sensitivity, right?

  • '19 '18

    I would keep the K factor. It will help new players joining the community find their correct spot in the ranking faster.

    If someone really weak comes, we need that player to fall quickly, otherwise the first couple of wins against them are overrated.
    And if the next, I don’t know, Napoleon or Sun Tzu suddenly joins, we need that player to climb super fast, otherwise the losses will hurt the respective players more than they should.


  • @MrRoboto if the K factor is considered over a lifetime (and is too sensitive), an issue might be that players that are strong now, but were weaker in the past, will have their past games weigh down their current ELO. If I’m right on that, instead of it being a factor in one’s first games, can it be more sensitive in one’s most recent games? That might allow new players to move up more quickly without penalizing players that have been around for a while.


  • @MrRoboto

    mega! At first I thought like “may it be fun to play the league games whatever the ranking (system)” - but at this moment I find the project even more thrilling than going on with my games (:) mainly because you gently propose it as a matter of community! And by this you are doing great in keeping @gamerman01 's style!! It looks to me as what you @gamerman01 have fostered dearly is coming of age rather than plotting


  • @MrRoboto said in Proposal for a new, ELO-based, ranking system:

    I would keep the K factor. It will help new players joining the community find their correct spot in the ranking faster.

    Ah, of course this is right.
