Tools and Techniques
Recommender systems
Input: Set of users + set of items + rating matrix.
Problem: given a user, predict their rating for an item.
In the real world, the rating matrix is sparse: most users have rated only a few items.
Can use hybrid approaches.
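To make the sparsity point concrete, here is a minimal sketch of a rating matrix stored as nested dictionaries, so that missing ratings simply take no space. The user names, item names and the `density` helper are made up for illustration; they are not from the talk.

```python
# Hypothetical toy data: user -> {item -> rating}. Unrated pairs are absent.
ratings = {
    "alice": {"item1": 5, "item3": 4},
    "bob": {"item1": 3, "item2": 1},
    "carol": {"item4": 2},
}


def density(ratings):
    """Fraction of the user x item matrix that actually holds a rating."""
    items = {item for user_ratings in ratings.values() for item in user_ratings}
    filled = sum(len(user_ratings) for user_ratings in ratings.values())
    return filled / (len(ratings) * len(items))


# 5 ratings out of 3 users x 4 items.
print(density(ratings))
```

Real data sets tend to be far sparser than this toy example, which is why the dictionary-style layout is a common starting point.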
Collaborative RS:
- Like Amazon.
- Based on other users with similar profiles.
- Experimentally better than content-based, but you don't always have many users.
Knowledge-based RS:
- No/little user history.
- Based on domain knowledge.
User-based collaborative recommendation:
- Pearson's correlation coefficient - baseline.
- Imagine millions of users - computing similarities takes a lot of time.
- So ..
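A rough sketch of the Pearson baseline for two users. It assumes ratings are stored as item -> rating dictionaries and that means are taken over the co-rated items only (one common variant; other formulations use each user's overall mean). The function name and sample users are hypothetical.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation between two users over the items both have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # not enough overlap to say anything
    a = [ratings_a[item] for item in common]
    b = [ratings_b[item] for item in common]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a user who rates everything the same carries no signal
    return cov / sqrt(var_a * var_b)


# Hypothetical users: bob rates like alice, carol rates the opposite way.
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3}
carol = {"a": 1, "b": 3, "c": 2}
print(pearson(alice, bob), pearson(alice, carol))
```

Computing this for every pair of users is quadratic in the number of users, which is the scaling problem noted above.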
Item-based collaborative recommendation:
- Focus on items not users.
- Compute similarity between each pair of items.
- Don't have to compute similarity between items that don't have overlapping ratings.
- Cosine similarity, or adjusted cosine similarity (subtracts each user's average rating to remove some per-user bias).
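A minimal sketch of adjusted cosine similarity between two items, assuming the same user -> {item -> rating} layout and that each user's mean is taken over all of that user's ratings. Names and data are illustrative only.

```python
from math import sqrt


def adjusted_cosine(ratings, item_i, item_j):
    """Adjusted cosine similarity over users who rated both items."""
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if item_i in user_ratings and item_j in user_ratings:
            # Subtract the user's own average to remove per-user bias.
            mean = sum(user_ratings.values()) / len(user_ratings)
            dev_i = user_ratings[item_i] - mean
            dev_j = user_ratings[item_j] - mean
            num += dev_i * dev_j
            den_i += dev_i * dev_i
            den_j += dev_j * dev_j
    if den_i == 0 or den_j == 0:
        return 0.0  # no co-raters, or no variation around the user means
    return num / sqrt(den_i * den_j)


# Hypothetical ratings: u1 and u2 disagree on x versus z; u3 only rated w.
ratings = {
    "u1": {"x": 5, "y": 4, "z": 1},
    "u2": {"x": 1, "y": 2, "z": 5},
    "u3": {"w": 3},
}
print(adjusted_cosine(ratings, "x", "y"), adjusted_cosine(ratings, "x", "z"))
```

Pairs with no overlapping raters (such as `x` and `w` here) return 0.0 without further work, which matches the note that those similarities need not be computed.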
Content-based RS:
- Based on the description of the item and a profile of the user's interests.
- Items are described in terms of attributes/features.
- Finite set of values associated with features.
- Item representation is a vector.
- Don't necessarily have complete descriptions of items: a missing feature is just a 0 in the vector.
- Similarity between items:
- Jaccard similarity.
- Cosine similarity and TF-IDF (term frequency - inverse document frequency).
- Batch compute similarities offline, then use similarities to compute ratings on the fly based on user profile.
- Predict ratings only for the N nearest neighbours of items in the user profile that are not already in the user profile.
- An item is worth rating if more than x of its N nearest neighbours are in the user profile.
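The offline/online split above could be sketched roughly as follows, using Jaccard similarity over feature sets. This is one reading of the notes, not the speaker's implementation: the function names, the tie-breaking by item name, and the toy catalogue are all assumptions, and the candidate filter implements only the "more than x of N neighbours" criterion.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbours(items, n):
    """Offline step: for every item, its n most similar other items."""
    neighbours = {}
    for name, features in items.items():
        others = [other for other in items if other != name]
        # Highest similarity first; ties broken by name to stay deterministic.
        others.sort(key=lambda other: (-jaccard(features, items[other]), other))
        neighbours[name] = others[:n]
    return neighbours


def candidates(neighbours, profile, x):
    """Online step: unseen items with more than x neighbours in the profile."""
    return sorted(
        item
        for item, near in neighbours.items()
        if item not in profile and sum(other in profile for other in near) > x
    )


# Hypothetical catalogue: item -> set of features.
items = {
    "a": {"scifi", "space", "robots"},
    "b": {"scifi", "space"},
    "c": {"scifi", "robots"},
    "d": {"romance", "paris"},
    "e": {"romance", "london"},
}
neighbours = nearest_neighbours(items, n=2)  # batch-computed once
print(candidates(neighbours, profile={"b", "c"}, x=1))
```

Only the items returned by `candidates` would then get a predicted rating, which keeps the on-the-fly work small.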
Using LOD (Linked Open Data)
To mitigate lack of information/descriptions about concepts/entities.
Recommender systems are usually vertical (single-domain), but Linked Data lets you easily build a multi-domain recommender system.
To avoid noisy data, you have to filter it before feeding your RS.
Freebase.
Tìpalo
- Automatic typing of DBpedia entities.
Vector space model for LOD
- MATHS.