1. My first Profile

In this notebook, I explain how to create a profile of voters with embeddings.

[1]:
import embedded_voting as ev
import numpy as np
import matplotlib.pyplot as plt

Build a profile

Let’s first create a simple profile of ratings, with \(m=5\) candidates and \(n=100\) voters:

[2]:
n_candidates = 5
n_voters = 100
profile = ev.Ratings(np.random.rand(n_voters,n_candidates))
profile.voter_ratings(0)
[2]:
array([0.9818839 , 0.50767343, 0.35742337, 0.4115904 , 0.51233348])

Here we created a profile with random ratings between \(0\) and \(1\). We could also have used the impartial culture model, which draws the ratings uniformly at random:

[3]:
profile = ev.RatingsGeneratorUniform(n_voters)(n_candidates)
profile.voter_ratings(0)
[3]:
array([0.43593987, 0.95181078, 0.60167015, 0.42875782, 0.78548049])

We can also change the ratings afterwards, for instance by saying that the last 50 voters do not like the first 2 candidates:

[4]:
profile[50:,:2] = 0.1
profile.voter_ratings(50)
[4]:
array([0.1       , 0.1       , 0.97879522, 0.85108355, 0.82567096])

Now, we want to create embeddings for our voters. To do so, we create an Embeddings object:

[5]:
embs = ev.Embeddings(([[.9,0,.1],
                        [.8,.1,0],
                        [.1,.1,.9],
                        [0,.2,.8],
                        [0,1,0],
                        [.2,.3,.2],
                        [.5,.1,.9]]), norm=False)

embs.voter_embeddings(0)
[5]:
array([0.9, 0. , 0.1])

We can normalize the embeddings, so that each vector has norm \(1\):

[6]:
embs = embs.normalized()
embs.voter_embeddings(0)
[6]:
array([0.99388373, 0.        , 0.11043153])
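
As a sanity check, the same normalization can be reproduced with plain NumPy. This is only a minimal sketch of the underlying arithmetic (dividing each row by its Euclidean norm); the names `raw`, `norms` and `unit` are illustrative and not part of the package.

```python
import numpy as np

# Same raw positions as in the cell that builds `embs` above.
raw = np.array([[.9, 0, .1],
                [.8, .1, 0],
                [.1, .1, .9],
                [0, .2, .8],
                [0, 1, 0],
                [.2, .3, .2],
                [.5, .1, .9]])

# Divide each row by its Euclidean norm so every voter lies on the unit sphere.
norms = np.linalg.norm(raw, axis=1, keepdims=True)
unit = raw / norms

print(unit[0])                        # first voter after normalization
print(np.linalg.norm(unit, axis=1))   # all norms are 1 up to rounding
```

The first row agrees with the output of `embs.normalized()` shown above.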

You can also use an Embedder to generate embeddings from the ratings. The simplest one draws the embeddings at random from a uniform distribution:

[7]:
embedder = ev.EmbeddingsFromRatingsRandom(3)
embeddings = embedder(profile)
embeddings.voter_embeddings(0)
[7]:
array([0.76396325, 0.43197473, 0.47933077])

Let’s now create more complex embeddings for our profile, with three groups of voters scattered around three different positions:

[8]:
positions = [[.8,.2,.2] + np.random.randn(3)*0.05 for _ in range(33)]
positions += [[.2,.8,.2] + np.random.randn(3)*0.05 for _ in range(33)]
positions += [[.2,.2,.8] + np.random.randn(3)*0.05 for _ in range(34)]
embs = ev.Embeddings(np.array(positions), norm=False)

There are several ways to create embeddings, some of them using the ratings of the voters, but we will see them in another notebook.

Visualize the profile

Now that we have a profile, we want to visualize it. Since our embeddings have only 3 dimensions, we can easily plot them on a figure.

There are two ways of plotting your profile, using a 3D plot or a ternary plot:

  • On the 3D plot, each voter is represented by a line from the origin to its position on the unit sphere.
  • On the ternary plot, the surface of the unit sphere is represented as a 2D space and each voter is represented by a dot.
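
To make the idea of a ternary plot concrete, here is a minimal sketch of one possible way to flatten a non-negative 3D vector onto a triangle: rescale the coordinates so that they sum to 1, then use them as barycentric weights on the corners of an equilateral triangle. The function `ternary_xy` and the choice of corners are assumptions made for illustration; the projection used by `embs.plot("ternary")` may differ.

```python
import numpy as np

def ternary_xy(v):
    """Map a non-negative 3D vector to 2D coordinates in an equilateral triangle.

    Hypothetical helper for illustration only, not the package's projection.
    """
    v = np.asarray(v, dtype=float)
    bary = v / v.sum()  # barycentric weights, which sum to 1
    corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
    return bary @ corners

print(ternary_xy([1, 0, 0]))  # a voter on the first axis lands on the first corner
print(ternary_xy([1, 1, 1]))  # a voter on the diagonal lands on the centroid
```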

On the following figures we can see the three groups of voters with similar embeddings (33, 33 and 34 voters) that I created in cell [8].

[9]:
embs.plot("3D")
embs.plot("ternary")
../_images/notebooks_profile_21_0.png
../_images/notebooks_profile_21_1.png
[9]:
TernaryAxesSubplot: -9223371919785834117

You can also plot the two figures side by side:

[10]:
fig = plt.figure(figsize=(15,7.5))
embs.plot("3D", fig=fig, plot_position=[1,2,1], show=False)
embs.plot("ternary", fig=fig, plot_position=[1,2,2], show=False)
plt.show()
../_images/notebooks_profile_23_0.png

Visualize the candidates

In the same way, you can visualize the candidates.

  • On a 3D plot, the score given by a voter to a candidate is represented by the length of the voter’s vector.
  • On a ternary plot, the score given by a voter to a candidate is represented by the size of the voter’s dot.

Use plot_candidate to plot only one candidate and plot_candidates to plot all the candidates. In the following plots, we can see that the blue group does not like the first two candidates.

[11]:
embs.plot_candidates(profile, "3D")
embs.plot_candidates(profile, "ternary")
../_images/notebooks_profile_26_0.png
../_images/notebooks_profile_26_1.png

Beyond 3 dimensions

What if the profile has more than 3 dimensions?

We still want to visualize the profile and the candidates.

In the following cell, we create a profile with 4 dimensions.

[12]:
embs = ev.EmbeddingsFromRatingsRandom(4)(profile).normalized()

We use the functions described above and specify which dimensions to use on the plots (we need exactly \(3\) dimensions).

By default, the function uses the first three dimensions.

In the following cell, we show the distribution of voters with different subsets of the \(4\) possible dimensions.

[13]:
fig = plt.figure(figsize=(30,7.5))
embs.plot("3D", dim=[0,1,2], fig=fig, plot_position=[1,4,1], show=False)
embs.plot("3D", dim=[0,1,3], fig=fig, plot_position=[1,4,2], show=False)
embs.plot("3D", dim=[0,2,3], fig=fig, plot_position=[1,4,3], show=False)
embs.plot("3D", dim=[1,2,3], fig=fig, plot_position=[1,4,4], show=False)
plt.show()
../_images/notebooks_profile_31_0.png
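
The four `dim` lists used above are simply all the ways of choosing 3 dimensions out of 4. For embeddings with more dimensions, they can be enumerated instead of written by hand. The sketch below uses only the standard library; `n_dim` and `triples` are illustrative names.

```python
from itertools import combinations

n_dim = 4  # number of embedding dimensions in the example above

# Every subset of exactly 3 dimensions, in increasing order.
triples = [list(t) for t in combinations(range(n_dim), 3)]
print(triples)  # [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]]
```

Each element of `triples` can then be passed as the `dim` argument of `plot`, as in the cell above.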

Recenter and dilate a profile

Sometimes the voters’ embeddings are very close to one another and it is hard to do anything with the profile, because it looks like every voter is the same.

For instance, we can create three voters with very similar embeddings:

[14]:
embeddings = ev.Embeddings([[.9,.3,.3],[.8,.4,.3],[.8,.3,.4]], norm=True)

If I plot this profile, the three voters are really close to each other:

[15]:
fig = plt.figure(figsize=(15,7.5))
embeddings.plot("3D", fig=fig, plot_position=[1,2,1], show=False)
embeddings.plot("ternary", fig=fig, plot_position=[1,2,2], show=False)
plt.show()
../_images/notebooks_profile_36_0.png

The first thing we can do is to recenter the population of voters:

[16]:
embeddings_optimized = embeddings.recentered(False)
[17]:
fig = plt.figure(figsize=(14,7))
embeddings.plot("ternary", fig=fig, plot_position=[1,2,1], show=False)
embeddings_optimized.plot("ternary", fig=fig, plot_position=[1,2,2], show=False)
plt.show()
../_images/notebooks_profile_39_0.png
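
One way to picture recentering is as a rotation that moves the mean direction of the voters onto the diagonal of the non-negative orthant while preserving all angles between voters. The sketch below implements that idea with NumPy and Rodrigues' rotation formula. It is an assumption about the geometry, not the package's implementation: `recentered` may define the center differently or use another algorithm, and `rotation_between` is a hypothetical helper.

```python
import numpy as np

def rotation_between(a, b):
    """Rotation matrix sending unit vector a to unit vector b (requires a != -b)."""
    k = np.outer(b, a) - np.outer(a, b)  # antisymmetric generator of the rotation
    return np.eye(len(a)) + k + (k @ k) / (1.0 + a @ b)

# The three voters from the cell above, normalized to the unit sphere.
points = np.array([[.9, .3, .3], [.8, .4, .3], [.8, .3, .4]])
points /= np.linalg.norm(points, axis=1, keepdims=True)

# Mean direction of the population and target direction (the diagonal).
center = points.mean(axis=0)
center /= np.linalg.norm(center)
diagonal = np.ones(3) / np.sqrt(3)

rot = rotation_between(center, diagonal)
recentered = points @ rot.T

print(rot @ center)                # equals the diagonal direction
print(recentered @ recentered.T)   # pairwise cosines are unchanged by the rotation
```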

Now, we can dilate the profile in such a way that the relative distance between each pair of voters remains the same, but they take all the space they can on the non-negative orthant.

To do so, we use the function dilated.

[18]:
embeddings_optimized = embeddings_optimized.dilated(approx=False)

As you can see on the second plot, voters are pushed to the extreme positions of the non-negative orthant.

[19]:
fig = plt.figure(figsize=(14,7))
embeddings.plot("ternary", fig=fig, plot_position=[1,2,1], show=False)
embeddings_optimized.plot("ternary", fig=fig, plot_position=[1,2,2], show=False)
plt.show()
../_images/notebooks_profile_43_0.png

Introduction to parametric profile generators

Our package also provides an easy way to build a profile with “groups” of voters who have similar embeddings and preferences.

To do so, we need to specify:

  • The number of candidates, dimensions, and voters in the profile.
  • The matrix \(M\) of the scores of each “group”. \(M(i,j)\) is the score given by group \(i\) to candidate \(j\), so \(M\) has one row per group and one column per candidate.
  • The proportion of the voters in each group.

For instance, in the following cell, I am building a profile of \(100\) voters in \(3\) dimensions, with \(5\) candidates. There are \(3\) groups in this profile:

  • The red group, with \(50\%\) of the voters. Voters from this group have preferences close to \(c_0 > c_1 > c_2 > c_3 > c_4\).
  • The green group, with \(30\%\) of the voters. Voters from this group have preferences close to \(c_1 \sim c_3 > c_0 \sim c_2 \sim c_4\).
  • The blue group, with \(20\%\) of the voters. Voters from this group have preferences close to \(c_4 > c_3 > c_2 > c_1 > c_0\).
[20]:
scores_matrix = np.array([[1, .7, .5, .3, 0], [.2, .8, .2, .8, .2], [0, .3, .5, .7, 1]])
proba = [.5, .3, .2]
n_voters = 100
n_dimensions, n_candidates = np.array(scores_matrix).shape
embeddingsGenerator = ev.EmbeddingsGeneratorPolarized(n_voters, n_dimensions, proba)
ratingsGenerator = ev.RatingsFromEmbeddingsCorrelated(0, scores_matrix, n_dimensions, n_candidates)
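
To check how the matrix is laid out, the following NumPy-only sketch reads each row of `scores_matrix` as one group, computes the group sizes expected from `proba`, and derives the ranking of candidates implied by each row. The names `expected_sizes` and `rankings` are illustrative, and the sizes are only expectations: I am assuming the generator may assign voters to groups at random, so actual sizes can differ.

```python
import numpy as np

scores_matrix = np.array([[1, .7, .5, .3, 0],
                          [.2, .8, .2, .8, .2],
                          [0, .3, .5, .7, 1]])
proba = [.5, .3, .2]
n_voters = 100

# One row per group, one column per candidate.
n_groups, n_candidates = scores_matrix.shape

# Expected number of voters per group.
expected_sizes = np.round(np.array(proba) * n_voters).astype(int)

# Candidates sorted from most to least liked; ties keep the candidate order.
rankings = [np.argsort(-row, kind="stable").tolist() for row in scores_matrix]

for group in range(n_groups):
    print(group, expected_sizes[group], rankings[group])
```

The first and last rows give the rankings \(c_0 > \dots > c_4\) and \(c_4 > \dots > c_0\) described for the red and blue groups, and the middle row puts \(c_1\) and \(c_3\) ahead of the other three candidates.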

Then, we need to specify the level of polarisation of the profile.

A high level of polarisation (\(> 0.5\)) means that the voters of each group are closely aligned with the dimension of their group. Therefore, their embeddings are very similar.

[21]:
embeddings = embeddingsGenerator(polarisation=0.7)

fig = plt.figure(figsize=(15,7.5))
embeddings.plot("3D", fig=fig, plot_position=[1,2,1], show=False)
embeddings.plot("ternary", fig=fig, plot_position=[1,2,2], show=False)
plt.show()
../_images/notebooks_profile_48_0.png

Conversely, if the level of polarisation is low (\(< 0.5\)), then voters’ embeddings are more random.

[22]:
embeddings = embeddingsGenerator(polarisation=0.2)

fig = plt.figure(figsize=(15,7.5))
embeddings.plot("3D", fig=fig, plot_position=[1,2,1], show=False)
embeddings.plot("ternary", fig=fig, plot_position=[1,2,2], show=False)
plt.show()
../_images/notebooks_profile_50_0.png
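
The effect of polarisation can be mimicked with a small NumPy sketch: each voter's embedding is a mix of the axis of its group and a random non-negative direction, weighted by `polarisation`, then normalized. This mixing rule is an assumption made for illustration and is not taken from the source of `EmbeddingsGeneratorPolarized`; `polarized_sketch` is a hypothetical function.

```python
import numpy as np

def polarized_sketch(n_voters, proba, polarisation, seed=0):
    """Illustrative model: mix each voter's group axis with a random direction."""
    rng = np.random.default_rng(seed)
    n_dim = len(proba)
    groups = rng.choice(n_dim, size=n_voters, p=proba)   # group of each voter
    noise = np.abs(rng.normal(size=(n_voters, n_dim)))   # random non-negative vectors
    noise /= np.linalg.norm(noise, axis=1, keepdims=True)
    axes = np.eye(n_dim)[groups]                         # basis vector of each group
    mixed = polarisation * axes + (1 - polarisation) * noise
    mixed /= np.linalg.norm(mixed, axis=1, keepdims=True)
    alignment = np.sum(mixed * axes, axis=1)             # cosine with the group axis
    return mixed, alignment

_, high = polarized_sketch(100, [.5, .3, .2], polarisation=0.7)
_, low = polarized_sketch(100, [.5, .3, .2], polarisation=0.2)
print(high.mean(), low.mean())
```

With the same seed, both calls draw the same groups and random directions, so the difference in mean alignment comes only from the polarisation level.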

The second important parameter is coherence.

The coherence parameter characterizes the correlation between the embeddings of the voters and the score they give to the candidates. If this parameter is set to \(1\), then the scores of a group dictate the scores of the voters in this group.

By default, it is set to \(0\), which means that the scores are totally random and there is no correlation between the embeddings and the scores.
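
A simple way to think about coherence is as the weight of a mix between the scores dictated by the group and purely random scores. The sketch below implements that reading with NumPy. It is a simplified assumption: it uses a hard group label per voter, whereas the package derives the ratings from the embeddings, and `ratings_sketch` is a hypothetical function.

```python
import numpy as np

def ratings_sketch(group_of_voter, scores_matrix, coherence, seed=0):
    """Illustrative model: blend group scores with uniform random scores."""
    rng = np.random.default_rng(seed)
    group_scores = scores_matrix[group_of_voter]       # one row of scores per voter
    random_scores = rng.random(group_scores.shape)     # uniform in [0, 1)
    return coherence * group_scores + (1 - coherence) * random_scores

scores_matrix = np.array([[1, .7, .5, .3, 0],
                          [.2, .8, .2, .8, .2],
                          [0, .3, .5, .7, 1]])
group_of_voter = np.array([0, 0, 1, 2])  # four voters and their groups

full = ratings_sketch(group_of_voter, scores_matrix, coherence=1.0)
none = ratings_sketch(group_of_voter, scores_matrix, coherence=0.0)
print(full)  # identical to the group scores
print(none)  # unrelated to the group scores
```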

[23]:
profile = ratingsGenerator(embeddings)
embeddings.plot_candidates(profile)
../_images/notebooks_profile_52_0.png

In the following cell, we can see that with a high coherence, embeddings and scores are strongly correlated.

[24]:
ratingsGenerator = ev.RatingsFromEmbeddingsCorrelated(0.8, scores_matrix, n_dimensions, n_candidates)
profile = ratingsGenerator(embeddings)
embeddings.plot_candidates(profile)
../_images/notebooks_profile_54_0.png