Bertrand, Maïté
[UCL]
Saerens, Marco
[UCL]
Market basket analysis is a technique used to get a first look at the links between several variables and to identify co-occurring items in consumers' baskets. It is very useful for companies, since it can serve as a basis for marketing decisions such as promotional support, inventory control and cross-sell campaigns. Discovering non-apparent affinities between products is often considered a challenge for retailers. In this master's thesis, we analysed a retail case for the chocolate market in country X by applying two methods. The first was association rules, which aim to answer the question: "How can we describe the chocolate market in country X in an exploratory way?" To address this general issue, we set out to answer four research questions: Which products is a certain kind of store most likely to sell? Which products are more likely to be bought in larger quantities? Which products is a manufacturer more likely to sell? And finally, during which period is a specific product more likely to be bought? To keep the best rules, we had to choose thresholds for support, confidence and lift. However, these criteria can compete with one another, so we had to find a compromise between them in order to obtain valuable rules. Using association rules, we found that some products are more likely to be bought during particular periods of the year, while others are sold throughout the year. We were also able to observe that some products tend to be purchased in larger quantities, although the median quantity bought is 1 unit per transaction. Moreover, we observed that some retailers more readily sell products from specific manufacturers and specific price ranges than others (discounters, for example, are in general more likely to sell products in the low price range).
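As an illustration of the three rule criteria named above, the following is a minimal sketch of how support, confidence and lift can be computed for a single rule. The toy transactions, the item labels and the threshold values are hypothetical and are not taken from the thesis data.

```python
# Minimal sketch: support, confidence and lift for one rule A -> C.
# The transactions and thresholds below are hypothetical examples.

def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = n_ac / n                             # share of baskets with A and C
    confidence = n_ac / n_a if n_a else 0.0        # share of A-baskets that contain C
    lift = confidence / (n_c / n) if n_c else 0.0  # confidence relative to C's base rate
    return support, confidence, lift


def keep_rule(metrics, min_support, min_confidence, min_lift):
    """Keep a rule only if it passes all three thresholds."""
    support, confidence, lift = metrics
    return (support >= min_support
            and confidence >= min_confidence
            and lift >= min_lift)


# Hypothetical baskets: each one records a store type and a price range.
transactions = [
    {"discounter", "low_price"},
    {"discounter", "low_price"},
    {"discounter", "mid_price"},
    {"supermarket", "mid_price"},
    {"supermarket", "high_price"},
]

metrics = rule_metrics(transactions, {"discounter"}, {"low_price"})
kept_loose = keep_rule(metrics, min_support=0.3, min_confidence=0.6, min_lift=1.0)
kept_strict = keep_rule(metrics, min_support=0.5, min_confidence=0.6, min_lift=1.0)
```

With these toy baskets the same rule passes the looser thresholds and fails the stricter support threshold, which illustrates why the criteria have to be traded off against one another.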
The assumptions that we made for these research questions, such as that expensive products are bought in small quantities, that products in the cheaper and medium price ranges are bought in larger quantities, and that pralines are more likely to be bought at the end of the year, were not all verified. The second method that we applied was collaborative recommendation, which aims to answer the question: "Which recommendation algorithm is best suited to a retailer's data?" To answer this, we compared several metrics for choosing the nearest neighbours. We tried five different algorithms: adjusted cosine, cosine similarity, Euclidean distance, Tanimoto index and "a/(a+b+c)". We then evaluated the potential of using such a system, knowing that the recommendations still have to be evaluated by experts in the field. After the comparison, we can say that there is no single best model for recommendation, but there is a best compromise according to three criteria: running time, mean accuracy in terms of precision, and the stability of this prediction. We found that a/(a+b+c) was the best compromise. Indeed, it is the fastest algorithm to provide a recommendation (84 hours). Moreover, even though it is not the most accurate one, it is very close to the best algorithm for this criterion (the Tanimoto index). On the other hand, on the stability criterion this algorithm is one of the worst among those we tested, but its stability remains good, since the standard deviation of its mean precision is less than 1%.
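To make the neighbour-selection measures more concrete, here is a minimal sketch of three of them applied to two item vectors. The abstract does not define a, b and c; the sketch assumes the usual binary contingency counts (a: customers who bought both items, b and c: customers who bought only one of them). The vectors are hypothetical and the functions are not the implementation used in the thesis.

```python
import math

# Minimal sketch of three similarity/distance measures between two items.
# Each vector holds hypothetical purchase quantities, one entry per customer.

def cosine_similarity(x, y):
    """Cosine of the angle between the two quantity vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def euclidean_distance(x, y):
    """Straight-line distance between the two quantity vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def abc_similarity(x, y):
    """a / (a + b + c) on purchase presence (assumed meaning of a, b, c)."""
    a = sum(1 for xi, yi in zip(x, y) if xi > 0 and yi > 0)   # bought both
    b = sum(1 for xi, yi in zip(x, y) if xi > 0 and yi == 0)  # bought x only
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi > 0)  # bought y only
    total = a + b + c
    return a / total if total else 0.0


# Hypothetical quantities bought by four customers for two products.
item_x = [2, 0, 1, 0]
item_y = [1, 1, 0, 0]

cos_xy = cosine_similarity(item_x, item_y)
dist_xy = euclidean_distance(item_x, item_y)
abc_xy = abc_similarity(item_x, item_y)
```

Under this assumption, the a/(a+b+c) measure only needs to count matching and non-matching purchases, without any products or square roots of quantities.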


Bibliographic reference
Bertrand, Maïté. Analysing and predicting associations: a retail case study. Louvain School of Management, Université catholique de Louvain, 2017. Prom.: Saerens, Marco.
Permanent URL
http://hdl.handle.net/2078.1/thesis:10293