Is the Condorcet method suitable for IFcomp?

Denk · January 31, 2020, 1:01pm

This thread is for discussing the details about the Condorcet method for the purpose of applying it in the IFcomp, since the Condorcet method might take up too much space in this thread.

Denk · January 31, 2020, 2:11pm

I haven’t learned how to deal with blanks in matrices before but from your posts, I understand that it corresponds to treating unplayed games as neither higher or lower than any played game, which is what we want.

You referred to the Wikipedia article on ranked pairs which uses the assumption we want: “unstated candidates are assumed to be equal to the stated candidates.” so that’s good.

However, the main Wikipedia article on the Condorcet method did not mention this assumption. Instead it mentions this assumption twice: “Usually, when a voter does not give a full list of preferences they are assumed, for the purpose of the count, to prefer the candidates they have ranked over all other candidates.” This assumption would favor games which get played a lot over those which don’t.

Thus whoever applies the method on IFcomp data should understand this difference.

The main article mentions a matrix method which is rather simple. Unfortunately it is not described how to handle ties and unplayed games, i.e. unstated candidates.

In contrast the article on ranked pairs seems to apply some graph theory(?) to find the winner.

I don’t have the prerequisites when it comes to graph theory, so I am hoping someone can describe the corresponding matrix method, which I believe more people can understand.

Zarf, I understand that from one of your posts, that the simple matrix method shown in the main article in the section “Pairwise counting and matrices” can be applied if the unplayed games are treated correctly, i.e. you leave the relevant matrix fields blank, which is different from using zeros somehow.

Zarf, using the matrix example from “Pairwise counting and matrices” is there any chance you could explain what would happen to the matrices below if voter 1, who made ballot 1 below, had not played game D?
Thanks

Ballot 1:
	A	B	C	D
A	—	0	0	1
B	1	—	1	1
C	1	0	—	1
D	0	0	0	—

Sum matrix (3 voters):
	A	B	C	D
A	—	2	2	2     6 points (winner)
B	1	—	1	2     4 points (3rd)
C	1	2	—	2     5 points (runner-up)
D	1	1	1	—     3 points (4th)

zarf · January 31, 2020, 4:10pm

If the committee decides to go down this path, that fool will be me. So okay.

I will look at your example later today, but I expect the answer will come out “D cannot be precisely ranked relative to the other entries – insufficient information.”

bikibird · January 31, 2020, 7:07pm

A little background on why I suggested pairwise voting:

Some years back i visited the website of a political party (which shall remain nameless). They were crowd-sourcing bumper stickers. You could submit your entry, but you could also judge others. You’d be presented with two bumper stickers at a time and you’d pick which you liked better. It was very easy to participate in judging and you could judge as many or as little as you liked.

Essentially, they were doing some sort of pairwise voting. I felt like I was participating in a crowd-sourced bubble sort, which I enjoyed. That got me interested in learning about pairwise voting and ran into the Condorcet method. My assumption was that they were doing Condorcet, but I’m not mathematically ept enough to say with any certainty.

What I’m hoping for is to crowd-source a ranking where no one judge has ranked all the entries. If I know A is better than B and you know C is better than A, then together we know the ranking is C, A, B. If Condorcet does this, then that’s what I’m looking for. Otherwise, it’s something else.

zarf · January 31, 2020, 7:19pm

Yes, that is in essence what Condorcet does.

The obvious stumbling block is when some voters say “A > B”, some say “B > C”, and some say “C > A”. The goal then is to break the cycle in the way that disappoints the fewest voters. There are a lot of slightly different ways to figure this out, so an actual plan would require some research and testing.

bikibird · January 31, 2020, 7:32pm

Definitely, vote tallying needs to be super transparent. If there is interest in pursuing this, I would say maybe continue with the old method for at least a year or two for the purpose of awarding prizes, but run the Condorcet method side-by-side. Maybe call the results of that the “crowd favorite”. If Condorcet is later found to be “better”, then go ahead and drop the old style for future contests.

zarf · January 31, 2020, 10:35pm

Answering the earlier question:

In the original example, we have three voters, each of whom ranks four candidates:

X: B C A D
Y: D A C B
Z: A C B D

(That is, voter X ranks B first and D last.)

The wikipedia page adds everything up with 0 meaning “loses to” and 1 meaning “wins over”. However, we want to allow for the possibility of “indifferent”, so I’m going to use -1 and 1, with 0 for “indifferent”.

The sum matrix now looks like:

    A   B   C   D 
A   0   1   1   1 
B  -1   0  -1   1 
C  -1   1   0   1 
D  -1  -1  -1   0

(To work through the math in the top right corner: voters X and Z rated A higher than D; voter Y rated A lower than D. So this entry is 1 + 1 + -1 = 1.)

This sum matrix gives us an unambiguous outcome. A beats every other candidate. If you remove A from the matrix, C beats B and D. If you also remove C, then B beats D. So the final ranking is A C B D.

You asked what happens if voter X skipped playing D. The sum matrix would then look like

    A   B   C   D 
A   0   1   1   0 
B  -1   0  -1   0 
C  -1   1   0   0 
D   0   0   0   0

Now we have a problem. A beats B and C, but is indifferent versus D. In fact D is indifferent versus each other candidate. (Because voter Y ranked D above them all, but Z ranked D below them all.) There’s no way to place D without disappointing one voter, so the whole ranking is stuck.

This sort of problem is likely when you have very few voters. If you have 50 voters, even if they’re very polarized about D, the D-top and D-bottom factions are not likely to be perfectly balanced. A slight majority will break the deadlock.

zarf · January 31, 2020, 10:51pm

An example with the “crowd-sourcing” effect:

U: C B A
V: B A D
W: B D C
X: B C A
Y: D A C
Z: A B D

    A   B   C   D 
A   0  -2  -1   1 
B   2   0   1   3 
C   1  -1   0  -2 
D  -1  -3   2   0

Here we have six voters, each of whom only played three games. But we can still say that B beats every other candidate. After that it gets messy – A, C, and D have a cycle. But D’s preference over C is strongest so we put D in second place. Then C beats A.

Denk · February 1, 2020, 12:19am

Thanks for both of your examples. I can see how this method does not favour games with many votes, so that is great. It was a smart move to apply 1, -1 and 0 to allow for indifference. It is also a positive thing, that the sum matrix once again can be found as the sum of each “ballot matrix”. Creating the sum matrix is therefore pretty simple. The last steps seem pretty complicated though, especially considering around 80 games. I suppose that it is necessary to write a complicated computer program to evaluate the various cycles, then rank some games, then remove the ranked games from the matrix, evaluate again etc. But I’m sure it is doable. It could be fun to see it applied on some IFcomp data.

mathbrush · February 1, 2020, 12:29am

I’m sure many, if not all Condorcet advocates are aware of this, but I just thought I’d throw out:

The impossibility theorem says that every voting method has some flaw that makes it undesirable by most people’s standards (such as a low-ranking game changing the rankings of unrelated higher-ranking games, or a majority of people liking game A over game B but A places lower than B).

In fact, the only ranking method that doesn’t miss any of these criteria is a dictatorship, which means Ryan Veeder’s exposition has the right idea.

crisostimo · February 4, 2020, 2:48am

It feels like there is strictly less information in this system than a 1-10 rating system, with the exception of not allowing ties (and I think having ties is a feature). I can already use a 1-10 rating to determine an order, but I strictly lose any sort of distance metric between ratings.

Meta question - are we actually solving an existing problem? Is there a survey given out to judges where we’ve seen a trend in responses that a 1-10 system is just too hard?

zarf · February 4, 2020, 3:26am

It’s true that there’s no way to indicate “my first place is much much better than my second place”.

On the other hand, the 1-10 system has problems with people’s scales not necessarily being comparable. Does your 5 mean the same thing as my 5, sort of thing. Also the rules-lawyer temptation to give out only 1 and 10 scores to maximize your impact on the results! Ranked-pair systems avoid all of that.

I’m not saying we are. The original thread was an open call for ideas. Most of those ideas won’t get implemented.

bikibird · February 4, 2020, 6:04pm

Ideally, the survey would be given to people who aren’t judging under the current system. What changes would bring more judges in? See this survey of one, IFComp 2019 Follow-Up Survey: Responses Requested by Feb 9th - #58 by bikibird, for an example of what might be discouraging some newcomers from judging.