IFDB games sorting and Alternative Top 25 (formerly 100)

Denk · January 25, 2023, 9:33pm

Thanks @StJohnLimbo, I think this is very useful! Still wish there was an optional unweighted average sort implemented on IFDB…

Denk · January 25, 2023, 9:39pm

Not a bad idea. Still think it would be easier for new users to select “unweighted average”. As “highest rated first” is the default, most new users will not think about it (until they start seeing games with more stars placing lower, which confuses them). Those who do care should have that option. You could even have a pop-up warning if you like.

EDIT:
Don’t you even think unweighted average could be optional?
(rephrasing my question as it could be misunderstood. I just got the impression that you are in charge of IFDB changes and would like to know if this is out of the question.)

dfabulich · January 25, 2023, 9:55pm

I think “average rating” doesn’t actually solve the problems you’re trying to solve; I’m skeptical that it solves any problem at all.

New games are highlighted on the IFDB home page for a while. During a competition, all new games get some reviews. Overlooked games are a problem, but starsort is the best way to find those, too. Average rating isn’t the best way to find anything interesting, as far as I can tell.

If you, personally, want to look at the unweighted average ratings, you might enjoy doing this.

In your favorite browser, navigate to ifdb.org, open a JavaScript console, and paste in this code.

// Fetch every game with at least five ratings, as XML.
const xml = await fetch('https://ifdb.org/search?searchfor=%23ratings%3A5-&xml&pg=all').then(r => r.text());
const doc = new DOMParser().parseFromString(xml, 'application/xml');
// Flatten each <game> element into a plain object keyed by child tag name.
const results = [...doc.querySelectorAll('game')].map(game => {
    const result = {};
    for (const child of game.children) {
        if (child.tagName === 'published') {
            // <published> has nested elements; keep only its <machine> value.
            result.published = game.querySelector('machine')?.childNodes[0]?.wholeText;
        } else {
            result[child.tagName] = child.childNodes[0]?.wholeText;
        }
    }

    return result;
    // The XML values are strings, so convert them before comparing.
}).sort((a, b) => Number(b.averageRating) - Number(a.averageRating));
// Replace the current page with the sorted list.
document.body.innerHTML = "";
for (const result of results) {
    const div = document.createElement('div');
    const a = document.createElement('a');
    a.href = result.link;
    a.appendChild(document.createTextNode(result.title));
    div.appendChild(a);
    div.appendChild(document.createTextNode(` Average: ${result.averageRating}, Num Ratings: ${result.numRatings}`));
    document.body.appendChild(div);
}

That will show search results sorted by average rating. The top 5 games have a perfect 5-star score. You can raise the minimum number of ratings by editing the search URL in the code. It’s currently:

https://ifdb.org/search?searchfor=%23ratings%3A5-&xml&pg=all

Which is to say, it’s a search for games with at least five ratings. (%23 means # and %3A means a colon.) You can edit the %3A5- to %3A8- to require at least eight ratings, which excludes the games that have a perfect score from 7 or fewer ratings.
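If you would rather not edit the percent-encoding by hand, here is a minimal sketch that builds the same URL from a minimum rating count. The function name ratingsSearchUrl is made up for illustration; it only relies on the standard encodeURIComponent and on the URL shape already shown above.

```javascript
// Hypothetical helper (not part of IFDB): builds the search URL used above
// for "games with at least minRatings ratings".
function ratingsSearchUrl(minRatings) {
    // encodeURIComponent turns "#" into %23 and ":" into %3A.
    const query = encodeURIComponent(`#ratings:${minRatings}-`);
    return `https://ifdb.org/search?searchfor=${query}&xml&pg=all`;
}

console.log(ratingsSearchUrl(8));
```

You could then pass ratingsSearchUrl(8) to fetch() in place of the hard-coded URL.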

Here’s what I see:

Cragne Manor Average: 4.8889, Num Ratings: 18
Blow Your House Down Average: 4.8889, Num Ratings: 9
The Spectators Average: 4.8889, Num Ratings: 9
Father Leofwine is Dead Average: 4.8750, Num Ratings: 8
Counterfeit Monkey Average: 4.8357, Num Ratings: 207
Stay? Average: 4.8000, Num Ratings: 10
Hadean Lands Average: 4.7818, Num Ratings: 55
Fallen Hero: Rebirth Average: 4.7778, Num Ratings: 9
Heart of the House Average: 4.7778, Num Ratings: 9
Jolly Good: Cakes and Ale Average: 4.7778, Num Ratings: 9
Of Their Shadows Deep Average: 4.7692, Num Ratings: 13
Price of Freedom: Innocence Lost (expanded 2019 version) Average: 4.7692, Num Ratings: 13
Worldsmith Average: 4.7586, Num Ratings: 29
Dreamtruder Average: 4.7500, Num Ratings: 8

But… these results are not important. Should someone review Cragne Manor? Before Blow Your House Down? Before The Spectators?

IMO, clearly: no.

The searches I posted earlier, e.g. with #ratings:0-10, absolutely return better results for anyone looking for overlooked games.

Denk · January 25, 2023, 10:13pm

Well, I just used the words “zero standard deviation” because it was faster than saying games where everyone gives the same rating - not much math there. When I think of it, I am not even sure I like this simple approach with two groups. I do like my Alternative Top 100, which excludes games with few ratings (as of today it would exclude games with fewer than 8 ratings). But from the viewpoint of IFDB they will have to list ALL games, and then we have to decide what to do with the excluded titles. Where do they come in?

That question I haven’t found a good solution to. However, often users do searches with perhaps hundreds of hits and they will often only look at the top result. For that purpose, I think users should have the option of unweighted average. Anyone who doesn’t care will not notice and it should be simple to implement unweighted average.

I agree that we disagree.
But to me, saying that sorting by the unweighted average has no value is like saying that sorting by the number of ratings has no value just because it doesn’t statistically give you the best games. I think it was @mathbrush who at some point mentioned that he sometimes sorted by the number of ratings to find a good game. Even if the unknown best method is a mixture of ratings and the number of ratings, I still think the simple methods should be available. You already have 12 sorting methods, why not add a 13th? (oh - I forgot)

dfabulich · January 25, 2023, 10:48pm

One of the reasons I implemented starsort in IFDB was to replace sorting by Average Rating with something useful.

I investigated whether to use IMDb’s algorithm (which Pegbiter uses) or to use Evan Miller’s “starsort” algorithm, and settled on starsort.

I found that starsort returned, in my opinion, better results than IMDb, and I showed it to some other folks and they generally agreed.

IMDb’s algorithm overrates games with a high average rating that just happen to have more than the minimum “m” number of reviews. (Pegbiter’s algorithm hard-codes a minimum number of reviews. “m = minimum amount of ratings required (13)”)

Specifically, looking at Pegbiter’s list, I don’t think Worldsmith should rank higher than Lost Pig.
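For anyone following along, here is a rough sketch of the IMDb-style weighted rating as it has commonly been published: WR = (v / (v + m)) * R + (m / (v + m)) * C. The value m = 13 comes from Pegbiter's description quoted above; the site-wide mean C = 3.5 is a made-up placeholder, not a figure from IFDB or from Pegbiter's list, so the printed numbers are only illustrative.

```javascript
// IMDb-style weighted rating (a Bayesian average).
// R: the game's own average rating
// v: the game's number of ratings
// m: minimum number of ratings required (13, per Pegbiter's description)
// C: mean rating across all games (3.5 is an ASSUMED placeholder)
function weightedRating(R, v, m = 13, C = 3.5) {
    return (v / (v + m)) * R + (m / (v + m)) * C;
}

// Averages and rating counts taken from the list earlier in this thread.
const worldsmith = weightedRating(4.7586, 29);
const counterfeitMonkey = weightedRating(4.8357, 207);
console.log(worldsmith.toFixed(4), counterfeitMonkey.toFixed(4));
```

The pull toward C shrinks as v grows relative to m, so once a game is comfortably past m ratings, additional ratings change its score very little.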

One of the features of starsort is that it sorts not just by rating but also by popularity, effectively capturing how likely users are to recommend a game to others. Lost Pig reliably figures in lists of “Best IF of all time,” “Best games for newcomers,” etc. It’s a highly recommended game. Worldsmith is a good game, for sure, but not as many people would recommend Worldsmith over Lost Pig, and that’s reflected in the fact that Lost Pig has 454 ratings and Worldsmith only has 29.

An extra bonus: starsort works on any number of games with any number of reviews. There’s no way to apply Pegbiter’s algorithm to #ratings:0-10 because, of course, all of these games have fewer than 13 ratings.

I honestly think that starsort is, in fact, the best method. It might seem complicated, but it’s asking and answering the question “what games can we be confident are the best, based on the available evidence?”
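To make the “confident, based on the available evidence” idea concrete, here is a minimal sketch of Evan Miller's star-rating formula as I understand it from his write-up: a lower bound on the expected rating after adding one pseudo-rating per star level. The multiplier z = 1.65 and the second rating distribution are assumptions for illustration, and I am not claiming this matches IFDB's code line for line.

```javascript
// counts[i] is the number of ratings with (i + 1) stars, so five entries.
// z = 1.65 is an ASSUMED confidence multiplier, not a value from this thread.
function starsort(counts, z = 1.65) {
    const K = counts.length;                          // number of star levels
    const N = counts.reduce((sum, n) => sum + n, 0);  // total ratings
    // Expected value of f(stars) with one pseudo-rating added per level.
    const expect = f => counts.reduce(
        (sum, n, i) => sum + f(i + 1) * (n + 1) / (N + K), 0);
    const mean = expect(s => s);
    const variance = expect(s => s * s) - mean * mean;
    // Lower bound: fewer ratings means more uncertainty, hence a lower score.
    return mean - z * Math.sqrt(variance / (N + K + 1));
}

// 9 ratings averaging 4.8889 can only be eight 5s and one 4.
const fewRatings = starsort([0, 0, 0, 1, 8]);
// HYPOTHETICAL game: 200 ratings with a lower plain average of 4.59.
const manyRatings = starsort([2, 3, 10, 45, 140]);
console.log(fewRatings.toFixed(3), manyRatings.toFixed(3));
```

In this sketch the game with 200 ratings scores higher despite its lower plain average, which is the popularity effect described above.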

Adding an unweighted average option would make the sorting dropdown harder to use. There would be two confusingly similar options in the dropdown, and we’d somehow have to explain the difference between them.

If it were useful for something, that might be worth it, but if it’s just so someone (you?) can generate an annual report, well, you can generate a report using the code I’ve given you.

It’s worth discussing (which is why I’m engaging in the discussion), and, if lots of folks agreed that sorting by average rating was useful, I’d implement it.

I predict that won’t happen, but I’m open to seeing what others think.

Denk · January 25, 2023, 10:55pm

Dan Fabulich:

But… these results are not important. Should someone review Cragne Manor? Before Blow Your House Down? Before The Spectators?

Whether some results are important is a matter of taste. I don’t see what is wrong with Cragne Manor. It was given 5 stars by 17 people and only 3-4 were authors (one author excluded his rating). Only two people gave less.

It is my impression that ChooseYourStory have pushed up their ratings as a group - not saying they do not give their rating in good faith. I am not going into that, just saying that this wasn’t the case when I made the Alternative Top 100 in 2020.

Denk · January 25, 2023, 11:09pm

Maybe starsort is better than Pegbiter’s algorithm. I was simply pointing out that they all make different assumptions, and we don’t know if any of those assumptions are close to true. Showing the results to other people doesn’t prove that one is better than the other.

The statement that Worldsmith shouldn’t rank higher than Lost Pig is clearly subjective. My opinion, thus also subjective, is that Worldsmith is a masterpiece and that Lost Pig is a good game. 57% of voters gave Lost Pig five stars but 86% of voters gave Worldsmith five stars. To me, this is important. So what is most important? That those who played it liked the game so much or that more people played Lost Pig?

Denk · September 18, 2023, 7:43pm

In case anyone is interested, I have now added the 2023 edition on IFDB:
2023 Alternative Top 100

(the 2020 edition can be found here: 2020 Alternative Top 100 )

As these lists include games with as few as 5 ratings and are based on average ratings, you should mainly see them as promoting promising games (based on ratings) alongside more established games.

Afterward · September 19, 2023, 4:24am

I think this list is magnificent. The algorithm that generated it must be fantastic.

Denk · February 3, 2024, 4:32pm

Published a new edition:

Decided to limit it to 25 games going forward.