Bando de Dados Avançados - Recommender Systems

Recommender Systems: Collaborative Filtering & Dimensionality Reduction

Mining of Massive Datasets
Jure Leskovec, Anand Rajaraman, Jeff Ullman
Stanford University
*Adapted by Gustavo Coutinho

Note to other teachers and users of these slides: We would be delighted if you found our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. If you make use of a significant portion of these slides in your own lecture, please include this message, or a link to our web site: http://www.mmds.org

J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Collaborative Filtering
Harnessing quality judgments of other users


Previously - Content-Based



Utility Matrix

Users have preferences for certain items, and these preferences must be teased out of the data. Let’s represent them with a utility matrix!

[Figure: example utility matrix]


Collaborative Filtering

Consider user x

Find set N of other users whose ratings are “similar” to x’s ratings

Estimate x’s ratings based on ratings of users in N




Collaborative Filtering

Different from Content-Based Filtering

We don’t need to understand the content of a specific item!

Different users share their experiences



Finding “Similar” Users

Let rx and ry be the rating vectors of users x and y, respectively

Let’s try Jaccard similarity as a measure

rx = [*, _, _, *, ***]
ry = [*, _, **, **, _]


Now, rx and ry are considered as sets

rx = {1, 4, 5}
ry = {1, 3, 4}

Problem: Ignores the value of the rating!
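As a minimal sketch of the set-based comparison above, the snippet below computes the Jaccard similarity of the two example sets. The function name `jaccard` and the convention for two empty sets are assumptions made for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)

# Sets of rated item positions, taken from the example above.
rx = {1, 4, 5}
ry = {1, 3, 4}

sim = jaccard(rx, ry)
print(sim)  # 2 shared items out of 4 distinct items
```

The result depends only on which items were rated, which is exactly the problem noted above: a 1-star and a 5-star rating count the same.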


How can we bring the rating values into the formula? Use the cosine similarity measure

Now, rx and ry are considered as points

$$\text{similarity} = \cos(\theta) = \frac{r_x \cdot r_y}{\|r_x\| \cdot \|r_y\|}$$

rx = [1, 0, 0, 1, 3]
ry = [1, 0, 2, 2, 0]

Problem: Treats missing ratings as “negative”!
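The cosine formula above can be sketched in a few lines, using the two example vectors with missing ratings encoded as 0. The function name `cosine` and the zero-vector convention are illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two rating vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)

# Example vectors from above; 0 stands for a missing rating.
rx = [1, 0, 0, 1, 3]
ry = [1, 0, 2, 2, 0]

sim = cosine(rx, ry)
print(round(sim, 3))
```

Because a missing rating is stored as 0, it pulls the vectors apart just like a very low rating would, which is the problem noted above.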


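How do we balance the missing values? Pearson correlation coefficient

Sxy = items rated by both users x and y

$$\text{sim}(x, y) = \frac{\sum_{s \in S_{xy}} (r_{xs} - \bar{r}_x)(r_{ys} - \bar{r}_y)}{\sqrt{\sum_{s \in S_{xy}} (r_{xs} - \bar{r}_x)^2} \, \sqrt{\sum_{s \in S_{xy}} (r_{ys} - \bar{r}_y)^2}}$$

$\bar{r}_x$ and $\bar{r}_y$ = average rating of “x” and “y”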
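A hedged sketch of the Pearson formula follows. It assumes missing ratings are stored as `None` and that each user's average is taken over all items that user rated (the slide does not say whether the average is restricted to Sxy); the names `pearson` and `mean_rated` are illustrative.

```python
import math

def mean_rated(r):
    """Average over the items the user actually rated."""
    rated = [v for v in r if v is not None]
    return sum(rated) / len(rated)

def pearson(rx, ry):
    """Pearson similarity, summing only over items rated by both users (Sxy)."""
    mean_x, mean_y = mean_rated(rx), mean_rated(ry)
    common = [i for i in range(len(rx))
              if rx[i] is not None and ry[i] is not None]
    num = sum((rx[i] - mean_x) * (ry[i] - mean_y) for i in common)
    den_x = math.sqrt(sum((rx[i] - mean_x) ** 2 for i in common))
    den_y = math.sqrt(sum((ry[i] - mean_y) ** 2 for i in common))
    if den_x == 0 or den_y == 0:
        return 0.0  # convention chosen here when a deviation vector is all zero
    return num / (den_x * den_y)

# The earlier example vectors, with missing ratings as None instead of 0.
rx = [1, None, None, 1, 3]
ry = [1, None, 2, 2, None]

sim = pearson(rx, ry)
print(round(sim, 3))
```

Missing entries simply drop out of the sums, so they no longer act as low ratings.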


Similarity Metric

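Let’s consider the following utility matrix of users and ratings

Intuitively we want: sim(A,B) > sim(A,C)
Using Jaccard: sim(A,B) = 1/5 < sim(A,C) = 2/4 (the wrong order)
Using Cosine: sim(A,B) = 0.386 > sim(A,C) = 0.322 (the right order, by a small margin)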
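The utility matrix figure is not part of this text, so the sketch below assumes the example matrix from the Mining of Massive Datasets textbook (users A, B, C rating the movies HP1-HP3, TW, SW1-SW3); that matrix reproduces the Jaccard values quoted above. With these assumed ratings the cosine for A and B evaluates to about 0.380 rather than 0.386, and the ordering is unchanged.

```python
import math

# Assumed utility matrix (textbook example; the slide's figure is not in the text).
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
}

def jaccard(x, y):
    """Jaccard similarity of the sets of items rated by users x and y."""
    a, b = set(ratings[x]), set(ratings[y])
    return len(a & b) / len(a | b)

def cosine(x, y):
    """Cosine similarity, treating unrated items as 0."""
    rx, ry = ratings[x], ratings[y]
    dot = sum(rx[i] * ry[i] for i in rx.keys() & ry.keys())
    norm_x = math.sqrt(sum(v * v for v in rx.values()))
    norm_y = math.sqrt(sum(v * v for v in ry.values()))
    return dot / (norm_x * norm_y)

print(jaccard("A", "B"), jaccard("A", "C"))
print(round(cosine("A", "B"), 3), round(cosine("A", "C"), 3))
```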


Similarity Metric

Now, we’re going to use Pearson correlation

Subtracting the (row) mean

Using Pearson: sim(A,B) = 0.092 > sim(A,C) = -0.559
Notice that cosine similarity is a correlation when the data is centered at 0
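The centered comparison can be sketched as follows. The ratings are an assumption (the textbook example matrix for users A, B, C, since the slide's figure is not in the text); with them, subtracting each user's mean and then taking the cosine reproduces the two Pearson values quoted above.

```python
import math

# Assumed utility matrix (textbook example; the slide's figure is not in the text).
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
}

def center(row):
    """Subtract the row mean from every known rating; unknowns stay absent (0)."""
    mean = sum(row.values()) / len(row)
    return {item: r - mean for item, r in row.items()}

def centered_cosine(x, y):
    """Cosine similarity of the mean-centered rows of users x and y."""
    cx, cy = center(ratings[x]), center(ratings[y])
    dot = sum(cx[i] * cy[i] for i in cx.keys() & cy.keys())
    norm_x = math.sqrt(sum(v * v for v in cx.values()))
    norm_y = math.sqrt(sum(v * v for v in cy.values()))
    return dot / (norm_x * norm_y)

print(round(centered_cosine("A", "B"), 3))
print(round(centered_cosine("A", "C"), 3))
```

After centering, a missing rating (0) means "average" rather than "very low", which is why A and C now come out as dissimilar.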


Rating Predictions

How can we go from similarity metrics to recommendations?

Let rx be the vector of user x’s ratings
Let N be the set of k users most similar to x who have rated item i
Prediction for item i of user x:

Simple average:

$$r_{xi} = \frac{1}{k} \sum_{y \in N} r_{yi}$$

Similarity-weighted average:

$$r_{xi} = \frac{\sum_{y \in N} s_{xy} \cdot r_{yi}}{\sum_{y \in N} s_{xy}}$$

Where sxy = sim(x,y)
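Both prediction rules can be sketched directly from the formulas. The function names and the three-user neighbourhood below are hypothetical values chosen only for illustration.

```python
def predict_average(neighbor_ratings):
    """Plain average of the neighbours' ratings for item i."""
    return sum(neighbor_ratings) / len(neighbor_ratings)

def predict_weighted(sims, neighbor_ratings):
    """Average of the neighbours' ratings, weighted by similarity sxy."""
    total = sum(sims)
    if total == 0:
        raise ValueError("similarities sum to zero; weighted average undefined")
    return sum(s * r for s, r in zip(sims, neighbor_ratings)) / total

# Hypothetical neighbourhood N of k = 3 users who rated item i.
sims = [0.9, 0.6, 0.5]       # sxy for each neighbour y
ratings_for_i = [4, 5, 2]    # ryi for each neighbour y

print(predict_average(ratings_for_i))
print(predict_weighted(sims, ratings_for_i))
```

The weighted version lets the most similar neighbour dominate, so here it lands closer to that neighbour's rating than the plain average does.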


Item-Item Collaborative Filtering

So far we have used a user-user approach. What about item-item?
▪ For item i, find other similar items
▪ Estimate the rating for item i based on ratings for similar items
▪ Can use the same similarity metrics and prediction functions as in the user-user model


Item-Item CF (|N|=2)

                            Users
Movies    1   2   3   4   5   6   7   8   9  10  11  12
   1      1   -   3   -   -   5   -   -   5   -   4   -
   2      -   -   5   4   -   -   4   -   -   2   1   3
   3      2   4   -   1   2   -   3   -   4   3   5   -
   4      -   2   4   -   5   -   -   4   -   -   2   -
   5      -   -   4   3   4   2   -   -   -   -   2   5
   6      1   -   3   -   3   -   -   2   -   -   4   -

"-" = unknown rating; numbers = ratings between 1 and 5



                            Users
Movies    1   2   3   4   5   6   7   8   9  10  11  12
   1      1   -   3   -   ?   5   -   -   5   -   4   -
   2      -   -   5   4   -   -   4   -   -   2   1   3
   3      2   4   -   1   2   -   3   -   4   3   5   -
   4      -   2   4   -   5   -   -   4   -   -   2   -
   5      -   -   4   3   4   2   -   -   -   -   2   5
   6      1   -   3   -   3   -   -   2   -   -   4   -

"?" = estimate rating of movie 1 by user 5


                            Users
Movies    1   2   3   4   5   6   7   8   9  10  11  12   sim(1,m)
   1      1   -   3   -   ?   5   -   -   5   -   4   -       1.00
   2      -   -   5   4   -   -   4   -   -   2   1   3      -0.18
   3      2   4   -   1   2   -   3   -   4   3   5   -       0.41
   4      -   2   4   -   5   -   -   4   -   -   2   -      -0.10
   5      -   -   4   3   4   2   -   -   -   -   2   5      -0.31
   6      1   -   3   -   3   -   -   2   -   -   4   -       0.59

"?" = estimate rating of movie 1 by user 5



Neighbour selection: identify movies similar to movie 1, rated by user 5

Here we use Pearson correlation as similarity:
1) Subtract mean rating mi from each movie i
   m1 = (1+3+5+5+4)/5 = 3.6
   row 1: [-2.6, 0, -0.6, 0, 0, 1.4, 0, 0, 1.4, 0, 0.4, 0]
2) Compute cosine similarities between rows
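The two steps above can be sketched as below. The `movies` dictionary is a reconstruction: the rating values come from the slides, while the user (column) positions are an assumption chosen to agree with the centered row for movie 1 and with the sim(1,m) values shown. The helper names are illustrative.

```python
import math

# Ratings per movie as {user: rating}; column positions are reconstructed (assumed).
movies = {
    1: {1: 1, 3: 3, 6: 5, 9: 5, 11: 4},
    2: {3: 5, 4: 4, 7: 4, 10: 2, 11: 1, 12: 3},
    3: {1: 2, 2: 4, 4: 1, 5: 2, 7: 3, 9: 4, 10: 3, 11: 5},
    4: {2: 2, 3: 4, 5: 5, 8: 4, 11: 2},
    5: {3: 4, 4: 3, 5: 4, 6: 2, 11: 2, 12: 5},
    6: {1: 1, 3: 3, 5: 3, 8: 2, 11: 4},
}

def center(row):
    """Step 1: subtract the movie's mean rating from each known rating."""
    mean = sum(row.values()) / len(row)
    return {user: r - mean for user, r in row.items()}

def centered_cosine(a, b):
    """Step 2: cosine similarity between two centered rows (unknowns count as 0)."""
    ca, cb = center(a), center(b)
    dot = sum(ca[u] * cb[u] for u in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b)

sims = {m: round(centered_cosine(movies[1], movies[m]), 2) for m in movies}
print(sims)
```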



Compute similarity weights: s1,3 = 0.41, s1,6 = 0.59




Predict by taking weighted average

r1,5 = (0.41*2 + 0.59*3) / (0.41+0.59) = 2.6
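As a small sketch of the last two steps, the snippet below picks the |N| = 2 most similar movies among those user 5 has rated and takes the weighted average. The similarity values are the ones given on the slides; user 5's ratings for movies 4 and 5 are an assumption taken from a reconstruction of the matrix (only the ratings for movies 3 and 6 appear in the formula above).

```python
# Similarities of each movie to movie 1, as given on the slides.
sim_to_movie1 = {2: -0.18, 3: 0.41, 4: -0.10, 5: -0.31, 6: 0.59}

# User 5's known ratings; entries for movies 4 and 5 are assumed.
user5 = {3: 2, 4: 5, 5: 4, 6: 3}

k = 2
# Neighbours: the k movies rated by user 5 that are most similar to movie 1.
neighbors = sorted(user5, key=lambda m: sim_to_movie1[m], reverse=True)[:k]

num = sum(sim_to_movie1[m] * user5[m] for m in neighbors)
den = sum(sim_to_movie1[m] for m in neighbors)
prediction = num / den

print(neighbors, round(prediction, 1))
```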

                            Users
Movies    1   2   3   4   5   6   7   8   9  10  11  12
   1      1   -   3   - 2.6   5   -   -   5   -   4   -
   2      -   -   5   4   -   -   4   -   -   2   1   3
   3      2   4   -   1   2   -   3   -   4   3   5   -
   4      -   2   4   -   5   -   -   4   -   -   2   -
   5      -   -   4   3   4   2   -   -   -   -   2   5
   6      1   -   3   -   3   -   -   2   -   -   4   -


CF: Common Practice

Define similarity sij of items i and j
Select k nearest neighbors N(i; x)
▪ Items most similar to i that were rated by x
Estimate rating rxi as the weighted average:

$$r_{xi} = b_{xi} + \frac{\sum_{j \in N(i;x)} s_{ij} \cdot (r_{xj} - b_{xj})}{\sum_{j \in N(i;x)} s_{ij}}$$

baseline estimate for rxi: $b_{xi} = \mu + b_x + b_i$
▪ µ = overall mean movie rating
▪ bx = rating deviation of user x = (avg. rating of user x) - µ
▪ bi = rating deviation of movie i
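The baseline-corrected estimate can be sketched as follows, assuming the baseline for a pair (x, j) is µ + bx + bj. The function name, argument layout, and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def predict_with_baseline(mu, b_x, b_i, neighbors):
    """Estimate r_xi as baseline b_xi plus a weighted average of deviations.

    neighbors: list of (s_ij, r_xj, b_j) tuples, one per item j in N(i; x).
    """
    b_xi = mu + b_x + b_i
    # Each neighbour contributes how far user x's rating sits above its own baseline.
    num = sum(s * (r_xj - (mu + b_x + b_j)) for s, r_xj, b_j in neighbors)
    den = sum(s for s, _, _ in neighbors)
    if den == 0:
        return b_xi  # fall back to the baseline when no weight is available
    return b_xi + num / den

# Hypothetical values: overall mean 3.5, a generous user, a slightly disliked movie.
mu, b_x, b_i = 3.5, 0.5, -0.5
neighbors = [
    (0.75, 5.0, 0.5),   # similar item rated 5 against a baseline of 4.5
    (0.25, 3.0, 0.0),   # less similar item rated 3 against a baseline of 4.0
]

prediction = predict_with_baseline(mu, b_x, b_i, neighbors)
print(prediction)
```

Working with deviations keeps a user who rates everything high, or a movie that everyone rates low, from distorting the weighted average.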


Item-Item vs. User-User

In practice, it has been observed that item-item often works better than user-user
Why? Items are simpler, users have multiple tastes

Avatar LOTR Matrix Pirates

Alice 1 0.8

Bob 0.5 0.3

Carol 0.9 1 0.8

David 1 0.4



Pros/Cons of Collaborative Filtering

Pros:

Works for any kind of item
▪ No feature selection needed

Unexpected recommendations
▪ A user may receive recommendations that differ from their own active searches

Groups with similar ratings
▪ Users may connect with each other and form groups with similar interests


Pros/Cons of Collaborative Filtering

Cons:

Cold Start
▪ Need enough users in the system to find a match

Sparsity
▪ The user/ratings matrix is sparse
▪ Hard to find users that have rated the same items

First rater
▪ Cannot recommend an item that has not been previously rated
▪ New items, esoteric items

Popularity bias
▪ Cannot recommend items to someone with unique taste
▪ Tends to recommend popular items


Hybrid Methods

Implement two or more different recommenders and combine predictions
▪ Perhaps using a linear model

Add content-based methods to collaborative filtering
▪ Item profiles for new item problem
▪ Demographics to deal with new user problem


Remarks & Practical Tips
- Evaluation
- Error metrics
- Complexity / Speed


Evaluation

[Figure: utility matrix of known ratings, users (rows) by movies (columns)]


Evaluation

[Figure: the same users-by-movies utility matrix, with some known ratings withheld (shown as "?") to form the Test Data Set]
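One way to score predictions against the withheld test ratings is the root-mean-square error (RMSE). The metric is an assumption here: this text lists "Error metrics" without naming one, and RMSE is simply a common choice. The data below is hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over the withheld (user, movie) pairs."""
    squared = [(predicted[key] - actual[key]) ** 2 for key in actual]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical withheld ratings and the system's predictions for them.
actual = {(1, 2): 4, (3, 1): 2, (5, 4): 5}
predicted = {(1, 2): 3.5, (3, 1): 2.5, (5, 4): 4.0}

error = rmse(predicted, actual)
print(round(error, 3))
```

Lower is better; a value of 0 would mean every withheld rating was predicted exactly.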


Collaborative Filtering: Complexity

Expensive step is finding the k most similar customers: O(|X|), where X is the set of customers
▪ Too expensive to do at runtime
▪ Could pre-compute

Naïve pre-computation takes time O(k·|X|)

We already know how to do this!
▪ Near-neighbor search in high dimensions (LSH)
▪ Clustering
▪ Dimensionality reduction


Tip: Add Data

Leverage all the data
▪ Don’t try to reduce data size in an effort to make fancy algorithms work
▪ Simple methods on large data do best

Add more data
▪ e.g., add IMDB data on genres

More data beats better algorithms
http://anand.typepad.com/datawocky/2008/03/more-data-usual.html


Questions
