The new ELO-based ranking system

pacifiersboard

thx! I meant a copy of the ELO file that is enabling some (local) experimentation

Stucifer

The fact is, with lifetime ELO, some players start the year out ahead of others, and the ELO at the end of the year is largely reflective of most recent games, but not entirely. This is not necessarily a problem. It’s just one of the issues that needs to be discussed since we have a significant system change.

One idea: make a separate sheet for the Yearly rankings that takes the player’s Lifetime Ranking at the start of the year as their starting ELO, then only include the games from the year. Not sure how to calibrate the K value for that, maybe have all at the middle number for 10+ games played (90). Can be discussed.

That way, Lifetime players with high ELO are still recognized for their skill but someone that makes a lot of progress in a new year can move up more rapidly even if they played a lot of games previously. And someone that loses a lot of games that year would move down more rapidly.
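For anyone who wants to experiment with the yearly-sheet idea locally, here is a minimal sketch. It assumes the usual logistic expected-score curve with the F-Factor of 500 mentioned later in the thread and a flat K of 90; the function names and the sample players are made up for illustration and are not taken from the actual sheet.

```python
def expected_score(rating_diff, f_factor=500.0):
    """Expected score of the player who is `rating_diff` points ahead (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / f_factor))


def yearly_ratings(start_ratings, games, k=90.0):
    """Replay only this year's games, starting from each player's lifetime rating.

    start_ratings: dict of player -> lifetime ELO at the start of the year
    games: list of (winner, loser) tuples in the order they were played
    """
    ratings = dict(start_ratings)  # copy so the lifetime numbers stay untouched
    for winner, loser in games:
        delta = k * (1.0 - expected_score(ratings[winner] - ratings[loser]))
        ratings[winner] += delta
        ratings[loser] -= delta
    return ratings


# Hypothetical players and results, purely for illustration.
start = {"PlayerA": 1600.0, "PlayerB": 1600.0}
print(yearly_ratings(start, [("PlayerA", "PlayerB")]))
```

Because every game moves the same number of points from loser to winner, the total rating in the pool stays constant, which makes it easy to sanity-check a local copy.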

However, to @MrRoboto's point, the K value is already large: K at 70 gives a 35-point swing for even ELO matchups, and upsets quite a bit more than that. So it seems like just using the Lifetime rating should suffice for making brackets. 🤔🤔

Stucifer

@mr_stucifer said in Proposal for a new, ELO-based, ranking system:

middle number for 10+ games played (90)

Correction: I think 7-9 games is a K of 90. Somewhere around 90-110 would be a significant acceleration of change compared to 70.

gamerman01

OK, so there is a “slider bar” on sensitivity that can easily be adjusted.
Reading MrRoboto’s recent post a second time, I see the sensitivity to the past 6+ games can be set to essentially make the current ELO rating (at 12/31/XX) reflect the results of the past 12 months, and the objective is met.

So then would we just want another set of results/rankings that has a lower sensitivity, like for a life-ELO number? If the same sensitivity, K factor, is used and the last, say, 6-10 results are what mostly determine the current ELO rating, then older results are nearly irrelevant?

Just trying to understand

MrRoboto

@pacifiersboard said in Proposal for a new, ELO-based, ranking system:

thx! I meant a copy of the ELO file that is enabling some (local) experimentation

You can create your own copy.

File -> Make a copy

Stucifer

@gamerman01 They aren’t irrelevant, but they do factor in less over time generally speaking. They are important because it tries to calibrate you to your appropriate rating quickly, then from there it is more of a maintenance process unless there is relative improvement.

At the extremes of the ELO Rating, even current games are worth little unless there is an upset. If a 1900 plays 1100 and wins they will not go up very many points, as it is expected. But if the 1900 loses the 1100 will receive a huge boost and the 1900 will fall quite significantly. This does reduce the incentive for the highest-ranked players to play the lowest-ranked players, but that was already the case.

I am not sure how, but think it would be awesome if we could implement a possible Bid allowance into drastically different ELO games. Say a 500+ rating difference could get double the usual bid but have the game be worth half the points. This might promote playing between the extremes in skill levels, if useful.

MrRoboto

@mr_stucifer has summarized it perfectly. You clearly understand some math!

I agree, the incentive is not very high to play a lower-skilled player. However, you WILL gain points, if you win. You just need to be ready to take that risk.

Remember, I can always change how huge the impact of an upset is, by lowering or increasing the F-Factor. Right now it is at 500, which means that the system expects a player with 500 more rating than another player to win in 90% of the cases.

A lower F-Factor would squeeze everyone closer together, so the difference between #1 and the last player is lower. The number of points lost when the worst player wins against the #1 still remains the same, however. So the gained/lost points are a higher percentage of the total spread.
So upsets hurt the better player more and help the worse player more.

On the contrary, a higher F-Factor would increase the extreme ELO ratings at the top and bottom, so the points lost/gained do not have as big an impact.

With the current F-Factor of 500, the difference between the first and last player in the BM4 rankings is 966 points. So the system expects the #1 player to win 98.8% of games against the last one. Which sounds about right if you ask me.
Play this matchup 100 times and #1 will gain exactly 1 point 99 times and lose 79 points once.
The expected outcome is still positive for the better player…
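The numbers in this post can be reproduced with a short sketch. It assumes the standard logistic ELO curve with the F-Factor in place of the usual 400, and the later-game K of 80 mentioned elsewhere in the thread; the sheet's actual formula is not shown here, so treat the function names and the exact form as assumptions.

```python
def expected_score(rating_diff, f_factor=500.0):
    """Expected score of the player who is `rating_diff` points ahead (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / f_factor))


def rating_change(rating_diff, won, k=80.0, f_factor=500.0):
    """Points gained (positive) or lost (negative) by the higher-rated player."""
    score = 1.0 if won else 0.0
    return k * (score - expected_score(rating_diff, f_factor))


print(round(expected_score(500), 3))      # 0.909, about the 90% quoted for a 500-point gap
print(round(expected_score(966), 3))      # 0.988
print(round(rating_change(966, True)))    # 1
print(round(rating_change(966, False)))   # -79
```

Passing a different `f_factor` shows how the same rating gap translates into a different win expectation, which is the slider described above.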

MrRoboto

The system becomes stable with more games played, but still flexible enough to allow for adjustments when a player improves.

If your skill stays the same, you will oscillate around your “correct” ELO-Rating, gaining some points, losing some but always hover within a certain corridor around your skill level.

Your chance of breaking out of that corridor is when you actually improve your skill.

This is different than PPG, which becomes a lot more stable with many games.

If you play 50 games and you have, for example, a PPG of 4, that means you have 200 points.
Even a win against the best player around would increase that total to 208 points, but your PPG increases only to 4.08.
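The PPG arithmetic is easy to check. The sketch below assumes, as in the example, that the win is worth 8 points; the helper name is made up for illustration.

```python
def ppg_after_game(total_points, games_played, points_from_game):
    """Points-per-game average after adding one more result."""
    return (total_points + points_from_game) / (games_played + 1)


# 50 games at a PPG of 4 (200 points), then an 8-point win.
print(round(ppg_after_game(200, 50, 8), 2))  # 4.08

# The same win after only 10 games at a PPG of 4 moves the average much more.
print(round(ppg_after_game(40, 10, 8), 2))   # 4.36
```

The more games already in the average, the less any single result can move it, which is the stability being contrasted with ELO here.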

With the new system it doesn’t matter if you play 30, 50 or 100 games. After a certain threshold is reached (when the system has “found” your correct place), your ELO rating will not “solidify” more. It might only become more accurate in finding the correct spot.

That’s why the K-factor is so important: We want to reach that threshold as fast as possible. I think after around 15-20 games every player is where he/she should be. That’s a lot for a single year, but not for multiple years.

MrRoboto

@mr_stucifer said in Proposal for a new, ELO-based, ranking system:

I am not sure how, but think it would be awesome if we could implement a possible Bid allowance into drastically different ELO games. Say a 500+ rating difference could get double the usual bid but have the game be worth half the points. This might promote playing between the extremes in skill levels, if useful.

I support this idea. And I think together we can find the sweet spot.

My idea would be to take the average bid up until that game.

And then for every bid above that average, the ELO change could be multiplied by 2%.

Right now in BM, the Allied bid is 18.3 on average.

If I play Axis and give my opponent +30, that's 12 more than average.
If we are at the same ELO level, I would usually gain 40 for a win or lose 40 for a loss.

Factoring in the bid, I would gain 40 × 1.24 = 49.6 (so 50) points for a win, and I would only lose 40 × 0.76 = 30.4 (so 30) points for a loss.

An example for players with different ELO rating:

A 1800 wins against 1500.
Ratings change +16 and -16.

With a bid that is 12 higher than average, that changes to:

+20 and -12

Or a 1800 loses against a 1500
Usually that is +64 and -64.

But with a bid 12 higher than average, that changes to
+80 and -49

Do you think that 2% per bid point is too high or too low? Or about right?
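Here is a rough sketch of the 2%-per-bid-point idea, so the percentage can be tried with other values. The function name and parameters are invented for illustration. It only computes the scaled-up and scaled-down changes and deliberately does not say which one applies to the winner or loser, since a later post in the thread notes that assignment was mixed up.

```python
def bid_adjusted_changes(base_change, bid, average_bid, pct_per_point=0.02):
    """Return (scaled_up, scaled_down) ELO changes for a bid above the average."""
    extra_bid = round(bid - average_bid)   # 30 - 18.3 is treated as 12, as in the post
    factor = pct_per_point * extra_bid     # 2% per bid point gives 0.24 for 12 points
    return base_change * (1 + factor), base_change * (1 - factor)


for base in (40, 16, 64):
    up, down = bid_adjusted_changes(base, bid=30, average_bid=18.3)
    print(base, round(up, 1), round(down, 1))
```

With these inputs the 40-point and 16-point cases match the figures above after rounding; the 64-point case comes out at about 79.4 and 48.6, slightly below the +80 quoted.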

MrRoboto

@MrRoboto said in Proposal for a new, ELO-based, ranking system:

That’s why the K-factor is so important:

To be precise: The difference in K between the first and later games is the important part. The total value of K is not necessarily very important.

Right now K changes from 120 (first 3 games) to 80 in later games. So the first 3 games give 50% more points than later ones.

Lowering the 80 to a lower number would not only enhance the impact of early games compared to later ones, it would also narrow said corridor. You would oscillate a bit less around your “correct” rating.

If the difference between the early and the last games is too high, players with slightly more than 10 games might be far off from their correct spot when a couple of those early games are outliers.
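As a minimal sketch of the K schedule described here: it models only the two values named in this post (120 for the first 3 games, 80 afterwards). Any intermediate tiers the real sheet may use are left out, and the function names are hypothetical.

```python
def k_factor(games_played, early_k=120.0, late_k=80.0, early_games=3):
    """K for a player's next game, given how many games they have already finished."""
    return early_k if games_played < early_games else late_k


def even_match_swing(games_played):
    """Points at stake against an equally rated opponent (expected score 0.5)."""
    return k_factor(games_played) * 0.5


print(even_match_swing(0))   # 60.0 for one of the first 3 games
print(even_match_swing(10))  # 40.0 for a later game
```

Changing `late_k` while keeping `early_k` fixed is the adjustment discussed above: it widens or narrows the gap between early and later games.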

Stucifer

@MrRoboto Hmm, I like the idea of having the bid tied to the average. In your experience, does bidding generally start slightly above average? 2% per IPC seems like it might be a little high, at least for OOB with such a high average bid: going from a 40 to a 60 bid might not need to be worth 40% more points.

But in PtV going from 6 to 26 seems like it should be worth 40%. I am not very fluent in BM so I will not hypothesize. Perhaps some sort of relative scale is needed on this aspect as well.

MrRoboto

That is a valid point!

It has to be relative to the absolute amount.

And I made a slight error in the calculations above.

If I lose despite getting a higher bid, my loss will be more significant, of course.
The numbers are correct, but I mixed it up a little for winners/losers.

Let me think about a formula that factors in the absolute amount, but now I have to go to bed.

MrRoboto

Just entered the criteria of 6 games completed this year.

Only 4 players in PtV
5 players in OOB
12 players in BM4

Maybe we could reduce the requirement to 4 or 5 games completed?
Or we could see how many more games are being finished in these last 2 months

pacifiersboard

@MrRoboto

I am sorry indeed for bothering you with this request, but it is not giving this option to me. @mr_stucifer, does it work for you now?

farmboy

@MrRoboto it is now 6 for BM and 3 I believe for oob and PTV. Gamerman could confirm the numbers.

pacifiersboard

@MrRoboto

imo, bids should keep these two functions:

determining sides by
balancing between Allies and Axis (according to players’ individual preferences)

Balancing between different skill levels I see as already covered by the usual ELO system. Additional handicaps can hardly be regulated without distortion, so can they be left to sportsmanship and fun, without being reflected in ELO?

Regarding the incentive issue, we already have that:

top players are already “forced” to encounter weaker players as part of the play-offs
players with many games have to pick at least three different opponents in order to be eligible for more matches. Maybe this setting is appropriate to be modified for more alternation?

Stucifer

@pacifiersboard I made my own document in Google Sheets with the 6 columns, entered the data, and shared the document with Roboto. Then I let him copy it over onto the master.

Adam514

@MrRoboto said in Proposal for a new, ELO-based, ranking system:

@mr_stucifer said in Proposal for a new, ELO-based, ranking system:

I am not sure how, but think it would be awesome if we could implement a possible Bid allowance into drastically different ELO games. Say a 500+ rating difference could get double the usual bid but have the game be worth half the points. This might promote playing between the extremes in skill levels, if useful.

I support this idea. And I think together we can find the sweet spot.

My idea would be to take the average bid up until that game.

And then for every bid above that average, the ELO change could be multiplied by 2%.

Right now in BM, the Allied bid is 18.3 on average.

If I play Axis and give my opponent +30, that's 12 more than average.
If we are at the same ELO level, I would usually gain 40 for a win or lose 40 for a loss.

Factoring in the bid, I would gain 40 × 1.24 = 49.6 (so 50) points for a win, and I would only lose 40 × 0.76 = 30.4 (so 30) points for a loss.

An example for players with different ELO rating:

A 1800 wins against 1500.
Ratings change +16 and -16.

With a bid that is 12 higher than average, that changes to:

+20 and -12

Or a 1800 loses against a 1500
Usually that is +64 and -64.

But with a bid 12 higher than average, that changes to
+80 and -49

Do you think that 2% per bid is too high or too low? Quite right?

I don’t think rating changes should be dependent on bid. Both players consented to that bid, presumably for balance reasons. It’s what both players are satisfied playing with. No need to put a rating factor on that.

Stucifer

@Adam514 The primary impulse behind this idea, for me:

Find a way for high-rating players such as yourself to play lower-rated players like me, give them a generous bid, but not lose as many points if they lose.

I learn the most from games I lose and playing high-rated players is useful for me to learn stronger strategies. But I also would probably get walked over with a normal bid. If I underbid that opponent on an average bid and they accept they will steamroll me, but if I accept the average bid they will also steamroll me. So there is an inherent difficulty in finding a bid that works when there is such a large skill difference.

Stucifer

Unless the player is generous and gives me a larger than average bid from the get-go. What I’m suggesting is that if there was at least a mild incentive to do so, it might happen more often. Not to call anyone out, but entering over 500 games I see people playing the bottom-ELO players and half the time that player gets a below-average bid, and they are always losing those games.
