Embeddings From Ratings Correlation
class embedded_voting.EmbeddingsFromRatingsCorrelation(preprocess_ratings=None, svd_factor=0.95)
Use the correlation with each voter as the embeddings.
Conceptually, there are two levels of embedding:
- First, v_i = preprocess_ratings(ratings_voter_i) for each voter i, which is used as a computation step but not recorded.
- Second, M = v @ v.T, which is recorded as the final embeddings.
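The two levels above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the library's implementation: `unit_norm` and `correlation_embeddings` are hypothetical names introduced here, and `unit_norm` merely stands in for whatever callable is passed as `preprocess_ratings`.

```python
import numpy as np


def unit_norm(x):
    """Hypothetical preprocessing step: scale a ratings vector to Euclidean norm 1."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    # Assumption: an all-zero vector is returned unchanged to avoid dividing by zero.
    return x if norm == 0 else x / norm


def correlation_embeddings(ratings, preprocess=None):
    """Sketch of the two levels: v_i = preprocess(ratings_i), then M = v @ v.T."""
    ratings = np.asarray(ratings, dtype=float)
    if preprocess is None:
        v = ratings
    else:
        # First level: one preprocessed vector per voter (not kept afterwards).
        v = np.array([preprocess(row) for row in ratings])
    # Second level: entry (i, j) is the dot product between voters i and j.
    return v @ v.T


# Three voters rating three candidates; voters 0 and 2 agree up to scale.
ratings = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [2.0, 0.0, 0.0]])
M = correlation_embeddings(ratings, preprocess=unit_norm)
```

Note that `M` has one row and one column per voter, so each voter is embedded in a space whose dimension equals the number of voters, whatever the number of candidates.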
Other attributes are computed and recorded:
- n_sing_val: the number of singular values deemed relevant when the SVD is computed, a selection based on Principal Component Analysis (PCA).
- ratings_means: the mean rating for each voter (without preprocessing).
- ratings_stds: the standard deviation of the ratings for each voter (without preprocessing).
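As a rough illustration of these attributes, the sketch below computes per-voter means and standard deviations and counts "relevant" singular values. The counting rule is an assumption: it uses a common PCA heuristic (keep the smallest number of singular values whose share of the total reaches a threshold), which is not necessarily the criterion the class applies, and the exact role of `svd_factor` is not specified here. `count_relevant` is a hypothetical helper, and the population standard deviation (NumPy's default) is also assumed.

```python
import numpy as np

# Three voters (rows) rating three candidates (columns).
ratings = np.array([[1.0, 2.0, 3.0],
                    [2.0, 4.0, 6.0],
                    [3.0, 1.0, 2.0]])

# Per-voter statistics on the raw ratings, i.e. without preprocessing.
ratings_means = ratings.mean(axis=1)
ratings_stds = ratings.std(axis=1)  # assumption: population std (ddof=0)

# Singular values of the voter-by-voter matrix M = v @ v.T (no preprocessing here).
M = ratings @ ratings.T
singular_values = np.linalg.svd(M, compute_uv=False)


def count_relevant(singular_values, threshold=0.95):
    """Hypothetical rule: smallest k such that the k largest singular values
    account for at least `threshold` of their total sum."""
    singular_values = np.asarray(singular_values, dtype=float)
    total = singular_values.sum()
    if total == 0:
        return 0  # an all-zero matrix has no relevant direction
    cumulative = np.cumsum(singular_values) / total
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(singular_values))


n_relevant = count_relevant(singular_values)
```

Under this assumed rule, a rank-one matrix gives a count of 1 and an all-zero matrix gives 0.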
Examples
>>> np.random.seed(42)
>>> ratings = np.ones((5, 3))
>>> generator = EmbeddingsFromRatingsCorrelation(preprocess_ratings=normalize)
>>> embeddings = generator(ratings)
>>> embeddings
EmbeddingsCorrelation([[1., 1., 1., 1., 1.],
                       [1., 1., 1., 1., 1.],
                       [1., 1., 1., 1., 1.],
                       [1., 1., 1., 1., 1.],
                       [1., 1., 1., 1., 1.]])
>>> embeddings.n_sing_val
1
In fact, the typical usage is with center_and_normalize:
>>> generator = EmbeddingsFromRatingsCorrelation(preprocess_ratings=center_and_normalize)
>>> embeddings = generator(ratings)
>>> embeddings
EmbeddingsCorrelation([[0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0.]])
>>> embeddings.n_sing_val
0
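The all-zero result with center_and_normalize has a simple explanation: every voter gives the same rating to all candidates, so centering leaves a zero vector and all dot products vanish. The NumPy sketch below reproduces this and shows that, for ratings that do vary, centering followed by normalization yields Pearson correlations between voters. `center_then_unit_norm` is a hypothetical stand-in written for this illustration; how the library's own helper treats a constant vector is assumed, not confirmed.

```python
import numpy as np


def center_then_unit_norm(x):
    """Hypothetical stand-in: subtract the mean, then scale to Euclidean norm 1."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    norm = np.linalg.norm(centered)
    # Assumption: a constant vector (zero after centering) is mapped to zeros.
    return centered if norm == 0 else centered / norm


# Constant ratings, as in the example: every preprocessed vector is zero.
constant = np.ones((5, 3))
v_const = np.array([center_then_unit_norm(row) for row in constant])
M_const = v_const @ v_const.T

# Ratings with variation: dot products of centered unit vectors are
# exactly the Pearson correlations between the voters' rating vectors.
varied = np.array([[1.0, 2.0, 3.0],
                   [2.0, 4.0, 6.0],
                   [3.0, 2.0, 1.0]])
v = np.array([center_then_unit_norm(row) for row in varied])
M = v @ v.T
```

Here voters 0 and 1 rank the candidates identically and get a correlation of 1, while voter 2 ranks them in reverse and gets a correlation of -1 with both.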