Fast Tutorial

This notebook explains how to use the embedded_voting package in the context of epistemic social choice and algorithm aggregation.

With general algorithm aggregation rules (Average, Median, Likelihood maximization), you need diversity among the different algorithms. However, in the real world, it is not rare to have a large group of highly correlated algorithms, which are trained on the same datasets or share the same structure, and therefore give very similar answers. This can bias the results.
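To see why, here is a small NumPy illustration (our own sketch, not part of the package): when many algorithms share the same error, a plain average is dominated by that shared error, no matter how many correlated copies you add.

import numpy as np

rng = np.random.default_rng(0)
truth = 10.0
# 25 correlated algorithms share one noise term; 3 independent ones do not.
shared_noise = rng.normal(0, 2)
correlated = truth + shared_noise + rng.normal(0, 0.1, size=25)
independent = truth + rng.normal(0, 2, size=3)
scores = np.concatenate([correlated, independent])
# The mean stays close to truth + shared_noise: the correlated group's
# common error does not average out.
print(scores.mean())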

With this method, you do not suffer from these correlations between algorithms. This notebook simply explains how to use it.

First of all, you need to import the package:

[1]:
import embedded_voting as ev

Generator to simulate algorithm results

Then, if you want to aggregate algorithms’ outputs, you need those outputs in the first place. In this notebook, we use a score generator that simulates a set of algorithms with dependencies.

In the following cell, we create a set of \(35\) algorithms: \(25\) in the first group, \(7\) in the second group, and \(3\) isolated algorithms.

[2]:
groups_sizes = [25, 7, 1, 1, 1]
features = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

generator = ev.RatingsGeneratorEpistemicGroupsMix(groups_sizes,
                                                   features,
                                                   group_noise=8,
                                                   independent_noise=.5)

ratings = generator(n_candidates=20)
true_ratings = generator.ground_truth_
print(ratings.shape)
(35, 20)

The last command generates a matrix of scores that contains the outputs given by the algorithms for 20 inputs. To use this method with your own algorithms, provide their results as a score matrix of shape \(n_{voters} \times n_{candidates}\).
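For example, if your algorithms’ results are already stored in a NumPy array of this shape, you can pass it directly to the aggregator introduced below. A minimal sketch (my_scores is a hypothetical placeholder for your own data):

import numpy as np

# Hypothetical scores: 3 algorithms (voters) rating 4 inputs (candidates).
my_scores = np.array([[0.8, 0.2, 0.5, 0.9],
                      [0.7, 0.3, 0.4, 0.8],
                      [0.1, 0.9, 0.6, 0.2]])
print(my_scores.shape)  # (n_voters, n_candidates) = (3, 4)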

Find the best alternative

Now, we can simply create an *Aggregator* object with the following line:

[3]:
aggregator = ev.Aggregator()

The following cell shows how to run an “election”:

[4]:
results = aggregator(ratings)

Then we can obtain the results like this:

[5]:
print("Ranking :", results.ranking_)
print("Winner :", results.winner_)
Ranking : [2, 11, 5, 13, 16, 7, 18, 0, 3, 6, 12, 14, 1, 19, 9, 10, 15, 8, 17, 4]
Winner : 2

You will probably keep using the same Aggregator for other elections with the same algorithms, as in the following cell:

[6]:
for i in range(10):
    ratings = generator(20)
    print(f'Winner {i+1} : {aggregator(ratings).winner_}')
Winner 1 : 11
Winner 2 : 1
Winner 3 : 0
Winner 4 : 19
Winner 5 : 0
Winner 6 : 18
Winner 7 : 19
Winner 8 : 18
Winner 9 : 1
Winner 10 : 18

During each election, the Aggregator saves the scores given by the algorithms in order to know them better. However, it does not compute anything with this new data unless it is asked to.
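For instance, you can inspect the accumulated scores. We assume here that they are exposed through the ratings_history attribute (an assumption; check the API of your version):

# Assumption: the stored scores are exposed as `ratings_history`, with one
# row per algorithm and one column per candidate seen so far.
print(aggregator.ratings_history.shape)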

Every now and then, you can retrain your Aggregator with the newest data. We advise doing it often while there is not much training data; once you have run enough elections (typically, once you have seen as many candidates as you have algorithms), you no longer need to do it often.

To train your Aggregator on the newest data, do the following:

[7]:
aggregator.train()
[7]:
<embedded_voting.aggregation.aggregator.Aggregator at 0x2b789c15518>

You can also train it right before an election, using the data from that election:

[8]:
results = aggregator(ratings, train=True)

For the first election of your Aggregator, you do not need to specify train=True, because the Aggregator always does a training step when it is created.

Fine-tune the aggregation rule

If you want to go further, you can change some aspects of the aggregation rule.

The first thing that you may want to change is the aggregation rule itself. The default one is FastNash, but you can try FastLog, FastSum or FastMin, which can give different results.

We advise using FastNash, which shows stronger theoretical and experimental results.

[9]:
aggregator_log = ev.Aggregator(rule=ev.RuleFastLog())
aggregator_sum = ev.Aggregator(rule=ev.RuleFastSum())
aggregator_min = ev.Aggregator(rule=ev.RuleFastMin())
print("FastNash:", aggregator(ratings).ranking_)
print("FastLog:", aggregator_log(ratings).ranking_)
print("FastSum:", aggregator_sum(ratings).ranking_)
print("FastMin:", aggregator_min(ratings).ranking_)
FastNash: [18, 1, 0, 13, 15, 14, 2, 10, 11, 9, 19, 7, 3, 5, 12, 4, 8, 17, 6, 16]
FastLog: [18, 1, 0, 13, 15, 14, 10, 2, 11, 9, 19, 7, 3, 5, 12, 4, 8, 17, 6, 16]
FastSum: [18, 15, 1, 0, 13, 14, 11, 10, 9, 2, 7, 19, 3, 12, 5, 4, 17, 8, 6, 16]
FastMin: [18, 1, 0, 15, 13, 14, 11, 2, 10, 9, 19, 7, 3, 12, 5, 17, 8, 4, 6, 16]

You can also use the average rule:

[10]:
aggregator_avg = ev.Aggregator(rule=ev.RuleSumRatings())
results = aggregator_avg(ratings)
print(aggregator_avg(ratings).ranking_)
[18, 15, 1, 0, 13, 14, 11, 10, 9, 7, 2, 19, 3, 12, 17, 4, 5, 6, 16, 8]

You can also change the transformation of scores. The default one is the following:

\[f(s) = \sqrt{\frac{s}{\left\| s \right\|}}\]
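To make the formula concrete, here is a sketch of this transformation in NumPy (our own illustration of the formula above, not the package’s internal code; it assumes non-negative scores):

import numpy as np

def default_transformation(s):
    # f(s) = sqrt(s / ||s||): normalize the score vector by its Euclidean
    # norm, then take the element-wise square root.
    return np.sqrt(s / np.linalg.norm(s))

print(default_transformation(np.array([1.0, 4.0, 9.0])))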

But you can use any transformation you want, such as the identity function \(f(s) = s\). In general, if you use a coherent score transformation, it will not change the results much.

[11]:
aggregator_id = ev.Aggregator(rule=ev.RuleFastNash(f=lambda x, y, z: x))
print(aggregator_id(ratings).ranking_)
[18, 1, 13, 0, 15, 14, 10, 2, 11, 9, 19, 7, 3, 5, 4, 12, 8, 17, 6, 16]