NU: MySky: Personalisation & Recommendation Algorithms

"Bespoke information filter"

Personalisation is an increasingly common feature of customer experiences. In order to make more meaningful connections with users, businesses are attempting to "learn" more about their customers. This allows them, in theory, to target users with offers and information tailored to their needs. A personalisation technique can enable a website to target advertisements, promote products, personalise news feeds, recommend documents, offer appropriate advice and target users via e-mail. In other words, it is possible to monitor a user's behaviour and suggest relevant content as a result. With ever more data available, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a “new” computational social science. To get a better understanding of personalisation and of how this kind of data is collected and interpreted, we looked into some of the existing methods used to do so.

Recommender Systems

Recommender systems are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item, changing the way people find products, information, and even other people. Recommender systems are a useful alternative to search algorithms, since they help users discover items they might not have found by themselves. They study patterns of behaviour to predict what someone will prefer from among a collection of things they have never experienced. Recommender systems are an active research area in data mining and machine learning. Several different approaches are used to create recommendations for users, including content-based filtering, user-user collaborative filtering, item-item collaborative filtering, dimensionality reduction, and interactive critique-based recommenders. We are most interested in these examples:

Content-based: The system generates recommendations from two sources: the features associated with products and the ratings that a user has given them. Content-based recommenders treat recommendation as a user-specific classification problem and learn a classifier for the user's likes and dislikes based on product features.

Collaborative: The system generates recommendations using only information about rating profiles for different users. Collaborative systems locate peer users with a rating history similar to the current user and generate recommendations using this neighbourhood.

Demographic: A demographic recommender provides recommendations based on a demographic profile of the user. Recommended products can be produced for different demographic niches, by combining the ratings of users in those niches.

Knowledge-based: A knowledge-based recommender suggests products based on inferences about a user’s needs and preferences. This knowledge will sometimes contain explicit functional knowledge about how certain product features meet user needs.
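As a rough illustration of the demographic approach described above, the sketch below averages the ratings given by users in the same niche as the target user and recommends the highest-scoring items. The function name demographic_recommend, the age-band niches and the sample data are all hypothetical; a real system would use a much richer demographic profile.

```python
from collections import defaultdict

def demographic_recommend(users, ratings, target_user, top_n=2):
    """Rank items by mean rating among users who share the target's niche."""
    niche = users[target_user]
    totals = defaultdict(lambda: [0.0, 0])  # item -> [sum of ratings, count]
    for user, user_ratings in ratings.items():
        # Only combine ratings from other users in the same niche.
        if user == target_user or users.get(user) != niche:
            continue
        for item, rating in user_ratings.items():
            totals[item][0] += rating
            totals[item][1] += 1
    # Skip anything the target user has already rated.
    seen = ratings.get(target_user, {})
    means = {item: s / n for item, (s, n) in totals.items() if item not in seen}
    # Highest mean first; ties broken alphabetically for a stable order.
    return sorted(means, key=lambda item: (-means[item], item))[:top_n]

# Hypothetical data: each user is assigned to an age-band niche.
USERS = {"ana": "18-24", "ben": "18-24", "cat": "18-24", "dev": "55+"}
RATINGS = {
    "ben": {"sports_pack": 5, "film_pack": 3},
    "cat": {"sports_pack": 4, "kids_pack": 2},
    "dev": {"kids_pack": 5},
}

print(demographic_recommend(USERS, RATINGS, "ana"))
```

Note that the target user needs no rating history of their own here, only a niche, which is one reason this approach is attractive for brand-new users.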

Typically, recommender systems produce a list of recommendations through collaborative or content-based filtering. Content-based filtering approaches utilise a series of discrete characteristics of an item in order to recommend additional items with similar properties. Collaborative filtering approaches build a model from a user's past behaviour (items previously purchased or selected and/or numerical ratings given to those items) as well as from similar decisions made by other users, and then use that model to predict items (or ratings for items) that the user may have an interest in. These approaches are often combined to form Hybrid Recommender Systems.

Content-based Filtering

In a Content-based Recommender system, keywords or attributes are used to describe items. The Content-based Filtering method analyses the content of the items at hand and builds a user profile from the attributes of the items the user likes, indicating the type of item this user prefers. Direct feedback from a user, usually in the form of a like or dislike button, can be used to assign higher or lower weights to certain attributes. Once there are attributes for the users and attributes for the things to be recommended, a similarity function (or, conversely, a distance function, where a smaller distance means a closer match) scores each item against the user profile. Items are ranked by how closely they match the profile, and the best matches are recommended. Unlike collaborative filtering, this recommendation does not depend on actions other users have taken. In other words, these algorithms try to recommend items that are similar to those that a user liked in the past (or is examining in the present).
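The profile-and-similarity idea described above can be sketched in a few lines. This is only a minimal illustration under our own assumptions: items are described by hand-picked attribute weights, feedback is +1 for a like and -1 for a dislike, and cosine similarity is used as the similarity function (one common choice among many). All names and data are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_profile(items, feedback):
    """Add the attributes of liked items (+1) and subtract those of disliked ones (-1)."""
    profile = {}
    for item_id, signal in feedback.items():
        for attr, weight in items[item_id].items():
            profile[attr] = profile.get(attr, 0.0) + signal * weight
    return profile

def recommend_content_based(items, feedback, top_n=2):
    """Rank items the user has not rated by similarity to their profile."""
    profile = build_profile(items, feedback)
    candidates = [i for i in items if i not in feedback]
    ranked = sorted(candidates,
                    key=lambda i: cosine_similarity(profile, items[i]),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical catalogue: item -> attribute weights.
ITEMS = {
    "space_doc": {"documentary": 1.0, "science": 1.0},
    "cooking_show": {"lifestyle": 1.0, "food": 1.0},
    "physics_talk": {"science": 1.0, "lecture": 1.0},
    "baking_contest": {"food": 1.0, "competition": 1.0},
}
# Hypothetical feedback: the user liked one item and disliked another.
FEEDBACK = {"space_doc": 1, "cooking_show": -1}

print(recommend_content_based(ITEMS, FEEDBACK, top_n=1))
```

Only this one user's feedback is consulted, which is the sense in which the method does not depend on what other users have done.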

A key issue with content-based filtering is whether the system is able to learn user preferences from a user's actions regarding one content source and use them across other content types. When the system is limited to recommending content of the same type as the user is already using, the value of the recommendation system is significantly less than when other content types from other services can be recommended. For example, recommending news articles based on browsing of news is useful, but it is much more useful when music, videos, products, discussions etc. from different services can be recommended based on news browsing.

Collaborative Filtering

Collaborative Filtering means making predictions about a user's preferences based on preferences previously expressed by users. The underlying assumption in virtually all collaborative filtering approaches is that similar people have similar “interactions” with similar items, e.g. if a person A has the same opinion as a person B on an issue, A is more likely to share B's opinion on a different issue x than to share the opinion on x of a person chosen at random. This consideration is usually taken into account heuristically. In the context of predicting human preferences, block models assume that users and items can be simultaneously classified into categories, and that the category of the user and the category of the item fully determine the rating. This kind of model sheds light on the factors determining preferences, because it allows one to study the groupings that have the most explanatory power or that accurately account for certain features of the users' ratings.

The Collaborative Filtering method makes automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). For example, a Collaborative Filtering recommendation system for television tastes could make predictions about which television shows a user is likely to enjoy, given a partial list of that user's tastes (likes or dislikes). These predictions are specific to the user, but use information gleaned from many users.
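To make the television example above concrete, here is a minimal sketch of user-user collaborative filtering. It assumes likes and dislikes are stored as +1 and -1, measures how alike two users are with cosine similarity over the shows both have rated, and predicts a rating as the similarity-weighted average over the most similar users. The names, the data and the choice of similarity measure are our own assumptions rather than a standard implementation.

```python
import math

def user_similarity(ratings_a, ratings_b):
    """Cosine similarity over the shows both users have rated."""
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return 0.0
    dot = sum(ratings_a[s] * ratings_b[s] for s in shared)
    norm_a = math.sqrt(sum(ratings_a[s] ** 2 for s in shared))
    norm_b = math.sqrt(sum(ratings_b[s] ** 2 for s in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(ratings, user, show, k=2):
    """Similarity-weighted average of the k most similar users' ratings for a show."""
    neighbours = []
    for other, other_ratings in ratings.items():
        if other == user or show not in other_ratings:
            continue
        sim = user_similarity(ratings[user], other_ratings)
        if sim > 0:  # ignore users with opposite or unrelated tastes
            neighbours.append((sim, other_ratings[show]))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return None  # no similar user has rated this show
    return sum(sim * r for sim, r in top) / sum(sim for sim, _ in top)

# Hypothetical likes (+1) and dislikes (-1) for television shows.
RATINGS = {
    "ana": {"drama_a": 1, "quiz_b": -1, "crime_c": 1},
    "ben": {"drama_a": 1, "quiz_b": -1, "crime_c": 1, "nature_d": 1},
    "cat": {"drama_a": -1, "quiz_b": 1, "nature_d": -1},
    "dev": {"drama_a": 1, "crime_c": 1, "nature_d": 1},
}

print(predict(RATINGS, "ana", "nature_d"))  # positive: ana's neighbours liked it
print(predict(RATINGS, "cat", "crime_c"))   # None: nobody shares cat's tastes
```

The second call shows the weakness of a pure neighbourhood method: when no peer with similar tastes exists, it has nothing to say.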

Hybrid Recommender Systems

The term Hybrid Recommender System is used to describe any recommender system that combines multiple recommendation techniques to produce its output. Recent research has demonstrated that a hybrid approach, e.g. combining collaborative filtering and content-based filtering, can be more effective in some cases. Hybrid approaches can be implemented in several ways: by making content-based and collaborative predictions separately and then combining them; by adding content-based capabilities to a collaborative approach (and vice versa); or by unifying the approaches into one model. Several studies empirically compare the performance of hybrid methods with pure collaborative and content-based methods and demonstrate that the hybrid methods can provide more accurate recommendations. These methods can also be used to overcome some of the common problems in recommender systems, such as cold start, i.e. what to do with new users who have few ratings: to make accurate recommendations, a system often requires a large amount of existing data on a user.
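The first of these options, making the two predictions separately and then combining them, can be sketched as a simple weighted blend. In this hypothetical example the weight given to the collaborative score grows with the number of ratings the user has supplied, so a new user falls back on the content-based score; the function name hybrid_score and the threshold full_trust_at=20 are arbitrary assumptions for illustration.

```python
def hybrid_score(content_score, collab_score, n_user_ratings, full_trust_at=20):
    """Blend two scores, trusting the collaborative one more as history grows."""
    if collab_score is None:
        # Cold start: no collaborative prediction is available at all.
        return content_score
    # Weight rises linearly from 0 (no ratings) to 1 (full_trust_at or more).
    weight = min(n_user_ratings / full_trust_at, 1.0)
    return weight * collab_score + (1.0 - weight) * content_score

# A brand-new user: the content-based score is used as-is.
print(hybrid_score(0.8, None, n_user_ratings=0))
# A user with some history: the two scores are mixed half and half.
print(hybrid_score(0.8, 0.2, n_user_ratings=10))
# A user with a long history: the collaborative score dominates.
print(hybrid_score(0.8, 0.2, n_user_ratings=40))
```

A linear weight is only one possible design; the point is that the combination rule can depend on how much is known about the user.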

Data Mining

Data Mining is an analytic process designed to explore data in search of consistent patterns and/or systematic relationships between variables; the ultimate goal of data mining is prediction. Predictive data mining is the most common type of data mining and the one that has the most direct business applications. Understanding and ultimately predicting human preferences and behaviours is important. Indeed, the digital traces that we leave through all sorts of everyday activities (shopping, communicating with others, travelling) are ushering in a new kind of computational social science, which aims to shed light on human mobility, activity patterns, decision-making processes, social influence, and the impact of all these on collective human behaviour.

Traditionally, data mining techniques have been extensively employed in the area of personalisation, in particular in the data processing, user modelling and classification phases. More recently, the popularity of the semantic web ("semantic" relating to meaning in language or logic) has posed new challenges in the area of web personalisation, creating a need for richer semantic information to be utilised in all phases of the personalisation process. The use of semantic information allows for a better understanding of the information in the domain, which leads to a more precise definition of the user’s interests, preferences and needs, hence improving the personalisation process. Data mining algorithms are employed to extract this richer semantic information from the data.

25.1.15