The new ELO-based ranking system

oysteilo

I’d say everyone’s rating will increase. By “everyone” I mean most players.

gamerman01

The loser’s ELO drops more than the winner’s gains if the winner’s ELO is higher than the loser’s

I think you can access the “results” tab
Type in a player’s name in the yellow box and I think you’ll be able to see

It’ll be more efficient if @MrRoboto answers the rest

MrRoboto

We have 4489 games finished now.

The average rating is 1486.77
The Median rating is 1459.

So as you can see, everyone’s rating hasn’t increased and a brand new player with 0-0 is even better than average ;-)

Arthur Bomber Harris

Something is wrong if the average player is rated below a new player; that does not match my personal experience.

I looked through the data to see the apparent ELO of new players and they have been averaging around 1360 meaning they beat people ranked above this number the same frequency as they lost to people ranked below this number in their first game. I excluded the first few years of data as everyone was fresh in the League. Not the most rigorous mathematical calculation and I am sure you could be more accurate in the numbers.

Unfortunately this is a bit of a recursive calculation if you wanted to do this properly; adding in new people at ELO=1360 will lower the ELO of the entire community meaning you have to again adjust the ELO of newbies. In the end you probably will get around 1330 as the best estimate of a fresh person joining the League, but I would be interested if you could do a more thorough evaluation.

MrRoboto

You are correct of course.

I meant someone who joins the league is 1500, before finishing a game and therefore on paper better than average.
But most people seem to start with losses. I could get the correct data for that (and might find out that I am wrong with that hypothesis) but frankly am too lazy, so a rough estimate is looking at people who currently have very few games finished.

Out of 37 players who have completed a single game, only 8 have won that single game while 29 have lost it.

Out of 16 players who have completed exactly two games, NONE has won both and only 6 of them went 1-1 while 10 have 0-2.

Out of 20 players who have completed exactly three games, 2 have won all 3, 6 have won 2 out of 3, 4 went 1-2 and 8 went 0-3.

So we have 71 players with 1-3 completed games and only 14 out of 71 have a rating of 1500 or higher.

I suppose it’s safe to say that new players tend to be worse than average - which shouldn’t be surprising.

Arthur Bomber Harris

Do you want to adjust the starting value of ELO down a bit in your spreadsheet so there isn’t incentive to pick on noobs to increase ranking? Somewhere in the mid-1300s would be more appropriate?

I doubt there are many stellar players who are joining the League and will dominate veterans during their initial matches. The level of play in this forum is so much higher than that of the casual gamer who thinks OOB is actually balanced sans bid.

MrRoboto

This wouldn’t have any effect besides everyone’s rating going down.
In fact, it took me merely 5 seconds to change the starting ELO to 1300 and the result is:

Average rating is 1286
Median rating is 1259.

As you can see, the average / median is exactly 200 lower than before…
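
This is what one would expect from how Elo updates work: only the rating difference between the two players enters the formula, so moving the starting value moves every later rating by the same amount. A minimal sketch, assuming the standard logistic Elo formula, a constant K of 50 and a made-up list of results (the league spreadsheet may differ in its details):

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def run_league(games, start, k=50):
    """Replay (winner, loser) pairs; every player enters at `start`.

    k=50 is an assumed constant K factor, used only for illustration.
    """
    ratings = {}
    for winner, loser in games:
        rw = ratings.setdefault(winner, start)
        rl = ratings.setdefault(loser, start)
        delta = k * (1.0 - expected_score(rw, rl))
        ratings[winner] = rw + delta
        ratings[loser] = rl - delta
    return ratings


# Hypothetical results, not taken from the league data.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("B", "A")]

at_1500 = run_league(games, start=1500)
at_1300 = run_league(games, start=1300)

for name in sorted(at_1500):
    # Last column: difference between the two runs for this player.
    print(name, round(at_1500[name], 1), round(at_1300[name], 1),
          round(at_1500[name] - at_1300[name], 1))
```

Every player ends up 200 points lower in the second run, and the order of the players does not change.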

MrRoboto

Besides, we WANT to encourage players to take on fresh new recruits joining this community.

We even discussed a flat bonus for playing against a new player (with new being defined as having a maximum of 3 completed games), but ultimately decided against it.

I don’t have an issue with someone who tries “farming” new players. Yes, chances are these new players are overrated at 1500 and should on average be much lower, but remember: You DO risk facing an undiscovered elite player who could be a 2000 in disguise. Losing to that strong player would also give lots of negative rating since he / she is massively underrated at 1500 as well.
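
To put rough numbers on that risk, here is a sketch assuming the standard Elo expectation formula and a flat K of 50 for the established player. Both are assumptions, and the 1700 rating is a hypothetical example:

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


K = 50            # assumed K factor for an established player
veteran = 1700    # hypothetical veteran rating
newcomer = 1500   # league starting rating

e = expected_score(veteran, newcomer)
gain_if_win = K * (1.0 - e)   # what the veteran earns for a win
loss_if_lose = K * e          # what the veteran pays for a loss

print(f"expected score: {e:.2f}")
print(f"win:  +{gain_if_win:.0f}")
print(f"loss: -{loss_if_lose:.0f}")
```

Under these assumptions a win is worth about 12 points and a loss costs about 38, so the veteran has to win roughly three out of every four such games just to break even.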

gamerman01

I have a very similar concern as ABH after entering a couple thousand games.
I really appreciate the analysis (that MrRoboto quickly did) on players with 1-3 games complete (we currently have it set so the sensitivity of the first 3 games is higher), which shows that they significantly underperform.
Correct, the solution using this system is not to change the starting ELO number, and thanks for verifying that 100%, MrRoboto

I tend to think maybe it’s not too big a deal for us who keep playing, because you play a few more people who aren’t unknown and your rating starts to correct. I’ve been busy entering games (and I’m deep into 2012 when things were just starting up, so I’m super close), so I haven’t looked hard at how fast a player would correct after bottom feeding a little.

I wouldn’t be surprised if MrRoboto thinks of some adjustment, though one is not necessarily needed IMO

We do have a sensitivity scale in place that we haven’t really played with or discussed yet. That’s another discussion. Currently we’re most interested in getting a new player up to an appropriate ELO quickly (like within about 6 games) so that a good player would not be significantly under-rated when entering the playoffs in year 1. I suppose this relatively higher sensitivity in the first 3 games would also make a significantly below average player drop faster. So it’s hard to feast on the new guys, given that we don’t have too many new ones come in during a given year and that their rating quickly drops over games 1, 2, 3.

Not my clearest writing, there, my apologies. I’m tired, but I wanted to respond

Arthur Bomber Harris

@MrRoboto I had a cutoff date for looking at noobs, grandfathering in people who started the earlier years with a higher elo rating at start since most of the talent had been here for such a long time.

Newer players who began more recently had a lower assigned starting ELO since we appear to be attracting less talented people now instead of hardcore G40 whizkids. It isn’t the fairest of systems but does match the reality of the situation.

oysteilo

I still struggle here.

These are the numbers at the bottom of the “data” spreadsheet and show how much rating has changed depending on prior rating and who actually won the game. For the first five entries the increase in rating and decrease in rating is the same. To me this makes sense and the relative change is given by difference in original rating and who won the game.

Some of the other entries seem more random (I am sure there is a reason though). In one of the games the winner gained 58 points whereas the loser lost 37 points. How does this relate to the first 5 entries? I think this is important to understand. It is not obvious to me. Please explain.

oysteilo

I am making different posts. Here is another one… It is mostly just an observation and not any criticism. Using myself as an example, I am currently rated No. 3 in the ELO system for the OOB playoffs starting 01.01.2024, as I have finished the required 3 game cap.

In gamerman01’s spreadsheet I am currently tied for 8th place and it is touch and go whether I will qualify for the top bracket or not.

This tells me that my results in 2022 (and maybe in 2021) still count considerably towards my current ELO rating.

This might not be a problem, but it also may indicate a higher barrier for new players to qualify for the top bracket. I know there has been some discussion around how to place new players rapidly.

Maybe it would be appropriate for the design crew to give an overview of how the ELO system works for new players vs old players and how this relates to the 3 game cap and the influence of prior years. Also see my post above about the sum of gain and loss not always equalling zero.

Finally, again, these are not negative thoughts. I just don’t fully understand how everything comes together and would like to understand better before the change becomes active on 01.01.2024.

MrRoboto

I will address everything tomorrow. Today my whole family comes over for Christmas.

Speaking of Christmas:

Merry Christmas to everyone celebrating it! And happy holidays to everyone else.

AndrewAAGamer

I thought we agreed, in earlier discussions, that we would have one historic rating for fun and interest and one current rating for each new year? That way this year’s playoffs are only based on games played this year.

Using data from before the current playoff year makes the system more like golf or tennis, where players are rated and seeded based on a rolling average that dates back 2 years (golf) or one year (tennis). Except our rolling average would date back to whenever the player started to play. So good play over time is going to result in higher playoff seedings versus someone who had medium or bad play over time, even though both players had the exact same playing experience in the current year.

I would prefer a system more like football, basketball, baseball and hockey that starts everyone off at zero for the new year. The worst team in the league and the best team in the league from the previous year all start over even in the new current year.

My two cents…

gamerman01

Very well said, happy to have 2 respected players weigh in.
I’m going to wait until MrRoboto can answer some of these (I don’t know why ELO changes are the same for victor and defeated after game 1)

Part of the answer is that at this point the sensitivity ratings were somewhat arbitrarily put out there by MrRoboto, and I somewhat arbitrarily changed them to see what would happen.
That is, on the data sheet to the right, we currently have 110, 90, 70, 50 at thresholds of 3 games, 6 games, 10 games (3 and 6 of course have become accepted league thresholds, 3 for getting a firm tier, 6 for qualifying for BM)

The matter of a new player getting a reasonable seat at the playoff table after finishing minimum # of games for his very first year is fairly easy to address, I think.

You (plural) raised other points and every sentence is appreciated. I do not consider this post here a complete answer, but some thoughts in response to some of your thoughts that I think will help advance the discussion.

MrRoboto

First of all, I can’t quite follow @Arthur-Bomber-Harris’s last post. Sorry, but I don’t understand what you’re trying to say there.
All I can say is, that starting ELO is the same for everyone.

Then the easy answer to @oysteilo’s first post:
Winner and loser points may differ when one of them (or both) has fewer than 11 completed games.

This is because the first couple of games weigh more than later ones. I accomplish that with a “K Factor”, which expresses the amount a game is worth.
That K factor is really high for the first couple of games and then gradually decreases. As @gamerman01 said, the exact values are chosen “arbitrarily”, although of course we thought long and hard about them.

That K factor is essential, because new players all start at 1500, which is almost guaranteed not to be a perfect rating for them (most newbies are worse than that average, but some might be a lot better too). So we need the system to move new players as fast as possible to where they belong. Normally, a game between two equals awards only 25 (or -25) points. If a new player is actually a 1900 or a 1100 player, that 400-point climb or descent would take a long time without that sensitivity factor.
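
A rough sketch of that argument, assuming the standard Elo expectation formula and a flat K of 50 (which is what yields 25 points between equals). The helper `games_to_reach` is hypothetical: it replaces actual results with the average rating drift per game and keeps every opponent fixed at 1500, so it only gives a feel for the pace:

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


K = 50  # assumed flat K factor, with no extra weight on early games

# Two equally rated players: the winner gains K * (1 - 0.5).
print(K * (1.0 - expected_score(1500, 1500)))  # 25.0


def games_to_reach(target, true_strength, start=1500, opponent=1500, k=K):
    """Count games until the listed rating reaches `target`.

    Uses the average expected gain per game for a player whose real
    strength is `true_strength`, facing opponents rated `opponent`.
    """
    win_rate = expected_score(true_strength, opponent)
    rating, games = start, 0
    while rating < target:
        rating += k * (win_rate - expected_score(rating, opponent))
        games += 1
    return games


# A hypothetical 1900-strength newcomer listed at 1500.
print(games_to_reach(target=1800, true_strength=1900))
```

The gain per game shrinks as the listed rating approaches the true strength, which is why the last part of the climb is the slowest.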

Right now, the K factor we settled on is:

As you can see, the first 3 games are worth a little bit more than double the later games.
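
The table from the original post is not reproduced here. Going by the values @gamerman01 quoted earlier (110, 90, 70, 50 with thresholds at 3, 6 and 10 games), the schedule might look like the sketch below. How the thresholds map onto game counts and the two example ratings are assumptions; the aim is only to show why winner and loser can move by different amounts when each side is scored with its own K:

```python
def k_factor(completed_games):
    """Assumed schedule: K shrinks as a player completes more games."""
    if completed_games < 3:
        return 110
    if completed_games < 6:
        return 90
    if completed_games < 10:
        return 70
    return 50


def expected_score(rating_a, rating_b):
    """Standard Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rate_game(winner, loser):
    """winner and loser are (rating, completed_games) tuples.

    Returns (points gained by the winner, points lost by the loser).
    """
    (rw, gw), (rl, gl) = winner, loser
    surprise = 1.0 - expected_score(rw, rl)
    return k_factor(gw) * surprise, k_factor(gl) * surprise


# Hypothetical game: a newcomer in their 2nd game beats a player in their 8th.
gain, loss = rate_game(winner=(1481, 1), loser=(1500, 7))
print(round(gain), round(loss))  # 58 37
```

These ratings were picked to reproduce the 58 / 37 split @oysteilo asked about; other combinations could give the same numbers. Once both players are past the last threshold, both K values are 50 and the gain equals the loss.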

We can talk about these values, they are not set in stone. But I want to emphasize the importance of bringing new players to their appropriate rating asap. The only alternative would be placement matches - this would mean that new players are not rated at all for their first 5-10 games and opponents would receive / lose just a fraction of the normal worth. I don’t like this option for us.

Now the other question concerning playoff ranking / seeding.

@AndrewAAGamer is correct, originally I planned to use the current year only for playoff seeding. He said he also prefers this and compared it to football, basketball, baseball and hockey.

Now that comparison has one gigantic flaw however: All of these sports are in a league system where every participant has a fixed number of games and the exact same opponents. So it makes sense to start a season with a clean slate.

However, this is not the case with our community. With OOB and PtV, players need to have 3 completed games, with BM4 they need 6. But 3 games are not nearly enough to properly rate a player, especially not if one of those games was an upset (an unexpected loss / win). If we had an entry requirement of 10+ games per year, I’d definitely go for a clean slate every Jan 1!

Our game is more like the other two sports you mentioned (golf or tennis), even though they are still a bit different: It’s impossible to burst onto the scene as a complete nobody and expect to participate in the biggest tournament with only 3 games completed. A first time participant of a Grand-Slam-Tournament has proven himself/herself over many matches beforehand in smaller tournaments. We don’t have that luxury.
I think our game can probably best be compared to boxing, where everyone chooses their opponents and some have only very few matches per year, while others have some more.

I can change the system to rank playoff seedings only according to results in the current year. Which would mean everyone starts at 1500 (for playoff ranking only). But do we want that?

@oysteilo actually gave a great example!

Oysteilo started the year with OOB rating of 1669
He is 2-1 this year, with both of his wins being almost worthless (against dawgoneit), giving him only +3 each. He lost once against the #1 AndrewAAGamer for -16. Which gives oysteilo a final rating of 1659. Which is more or less the same rating he had for the last 7 years.

ArthurBomberHarris started the year with OOB rating of 1542
He went 5-0 this year, although only one of his opponents was really strong (he defeated #1 AndrewAAGamer!). He gained 110 in the process, which means he significantly improved his rating from 1542 to 1652. This is the highest OOB rating he ever achieved.

They are now almost identical in rating (1652 and 1659), with 32 and 41 total completed games. Which means the rating is very reliable; those two players are very likely extremely similar in strength.

Now it’s a personal decision: Do you think oysteilo should get the higher seed because of his slightly higher rating and the fact he maintained roughly that rating for 7 years? Even though his rating basically stagnated this year? Then the system should stay as it is.

Or do you think ArthurBomberHarris should get the higher seed because he is on an upward trajectory this year? Remember, Arthur is probably not better than oysteilo (they are most likely equally strong right now) and it is the first time he achieved the same level as oysteilo. But if you think the improvement he showed this year is worth more, the system should change to let everyone start at 1500 on Jan 1.

Do you think AndrewAAGamer who sits comfortably at #1, with only farmboy being SOMEWHAT close should get the top seeding? He went 7-2 and increased his OOB rating from 1798 to 1830.

Or do you think the 5-0 of Arthur and the 4-0 of Booper this year is more impressive and should give both of them a higher seeding, despite both of them definitely being not as good as Andrew?

My personal preference is the first option, which is currently implemented.
But I can absolutely understand if you value recent results higher than overall strength! This is a system that should be backed by the majority of the community so please: WEIGH IN!

oysteilo

Well explained @MrRoboto . I support your recommendation here

Arthur Bomber Harris

I also support your recommendation of multiyear ELO for playoff seeding. I am clearly not the strongest player going into the tournament: I needed a fair bit of luck to beat Andrew. I am an average player over my career and did not improve this year. No need to give me the #1 seed.

My main comment is to have everyone who signs up have a spot in the main tier of the playoff instead of having an upper and lower bracket. That won’t be a problem for OOB this year, but give players a chance to get the crown as a medium guy can sometimes pull enough upsets to become #1. Perhaps more of an issue for BM playoffs.

MrRoboto

Good point, @Arthur-Bomber-Harris , I also support a single tournament tree instead of having different brackets. But maybe I don’t see the merits of having multiple brackets, perhaps @gamerman01 or someone else can enlighten me.

oysteilo

@MrRoboto said in Proposal for a new, ELO-based, ranking system:

Good point, @Arthur-Bomber-Harris , I also support a single tournament tree instead of having different brackets. But maybe I don’t see the merits of having multiple brackets, perhaps @gamerman01 or someone else can enlighten me.

The brackets are the way they are mainly because of time. The champion should be “crowned” before a new playoff season starts. If I am not mistaken the BM championship is currently ongoing… Clearly, we won’t have multiple playoff seasons going on side by side.
