Better Capturing Interactions between Products in Retail:
Revisited Negative Sampling for
Basket Choice Modeling
Jules Désir1, Vincent Auriau1, 2, Martin Možina3 and Emmanuel Malherbe1
1 Artefact Research Center, 2 MICS - CentraleSupélec, 3 Fortenova Grupa
In ECML-PKDD 2025.
[Full Paper] [Appendices] [Oral Presentation]
Abstract: Brick-and-mortar retailers face many different challenges that involve thoroughly understanding their product catalog and customer preferences. In particular, assortment optimization - proposing the ideal mix of products - and promotion planning play a pivotal role in their strategy. By leveraging sales data, retailers can make informed decisions on which products to sell and how to manage inventory, based on customer preferences as well as regional and seasonal trends. It is especially crucial to capture interactions between products, in order to minimize the number of items that cannibalize each other’s sales and to ensure that complementary products, which are often purchased together, are jointly available and sold across all stores. In this paper, we propose a shopping basket model that learns embeddings to represent interactions between products, prices and stores. Our model is built to uncover sales patterns from very large transaction datasets. In particular, the optimization loss is computed with random negative samples in order to overcome the computational bottlenecks that arise with a large number of items. Our experiments on synthetic data show the efficiency of drawing such negative samples based on the actual assortment of available products, with better results than approaches from the literature. We also validate our approach by training and evaluating our model on a dataset composed of billions of transactions from a leading European retail company. Our model showcases promising applications in the retail sector, with enriched interfaces to efficiently support category managers.
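To make the idea of assortment-based negative sampling more concrete, here is a minimal sketch in plain NumPy. It is not the implementation used in the paper or in choice-learn: the function name `sample_negatives`, the binary availability vector, and the choice of uniform sampling without replacement are all assumptions made for illustration.

```python
import numpy as np


def sample_negatives(basket, assortment, n_negative_samples, rng):
    """Draw negative items among available items that were not purchased.

    Hypothetical helper: `basket` holds purchased item indices and
    `assortment` is a binary availability vector (1 = item on the shelf).
    """
    available = np.flatnonzero(assortment)
    # Negatives must be available in the store but absent from the basket.
    candidates = np.setdiff1d(available, basket)
    if candidates.size == 0:
        return np.empty(0, dtype=int)
    # Assumption: uniform draw without replacement, capped by pool size.
    k = min(n_negative_samples, candidates.size)
    return rng.choice(candidates, size=k, replace=False)


rng = np.random.default_rng(0)
assortment = np.array([1, 1, 0, 1, 1, 0, 1])  # items 2 and 5 are unavailable
basket = np.array([0, 3])                     # items actually purchased
negatives = sample_negatives(basket, assortment, 4, rng)
```

In this toy case only items 1, 4 and 6 are both available and unpurchased, so at most three negatives are returned even though four were requested. Items that were not on the shelf are never used as negatives, which is the property the abstract refers to.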
First you can clone the repository:
git clone git@github.com:artefactory/alea-carta-est.git
To import and train the models you will need the choice-learn library. You can pip install it:
pip install choice-learn
If you want to specifically run the experiments, you can install all the dependencies at once with:
pip install -r requirements.txt
The synthetic experiments can be run using this notebook.
Once your own dataset is instantiated as a TripDataset, you can train the model on it with:
from choice_learn.basket_models import AleaCarta, TripDataset, Trip

# Transform the basket data into Trips
customer_trips = []
for purchased_basket, prices, assortment in zip(*your_own_dataset):
    customer_trips.append(
        Trip(
            purchases=purchased_basket,
            prices=prices,
            assortment=assortment,
        )
    )
dataset = TripDataset(trips=customer_trips)

# Instantiate and train the model
model = AleaCarta(
    latent_sizes={"preferences": 8, "price": 4, "season": 4},
    n_negative_samples=4,
    lr=1e-4,
    epochs=128,
    batch_size=64,
)
model.fit(dataset)
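The snippet above unpacks `your_own_dataset` with `zip(*...)`, which suggests three parallel sequences with one entry per trip. The sketch below shows one possible layout with toy values. The use of integer item indices, per-item price arrays and binary availability vectors is an assumption for illustration; check the choice-learn documentation for the exact types that `Trip` accepts.

```python
import numpy as np

# Hypothetical toy catalog of 5 items, indexed 0..4.
purchased_baskets = [np.array([0, 2]), np.array([1, 3, 4])]
prices_per_trip = [
    np.array([1.5, 0.9, 2.0, 3.2, 0.5]),
    np.array([1.4, 0.9, 2.1, 3.0, 0.5]),
]
# Assumed encoding: 1 if the item was available during the trip, else 0.
assortments = [np.array([1, 1, 1, 0, 1]), np.array([1, 1, 0, 1, 1])]

your_own_dataset = (purchased_baskets, prices_per_trip, assortments)

# Sanity check before building Trips: every purchased item must be available.
n_trips = 0
for purchased_basket, prices, assortment in zip(*your_own_dataset):
    assert prices.shape == assortment.shape
    assert np.all(assortment[purchased_basket] == 1)
    n_trips += 1
```

Running a check like this before training helps catch misaligned item indices early.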
Get started with this notebook or open it directly in Colab.
You can also check choice-learn and its documentation, or contact us if you have any questions.
If you find our work or any of its features useful for your research, consider starring the repository and citing our paper:
@article{Desir2025,
doi = {},
url = {},
year = {2025},
publisher = {},
volume = {},
number = {},
pages = {},
author = {Jules Désir and Vincent Auriau and Martin Možina and Emmanuel Malherbe},
title = {Better Capturing Interactions between Products in Retail: Revisited Negative Sampling for Basket Choice Modeling},
journal = {} }
If you make use of the choice-learn library or find any of its features useful for your research, please also consider citing:

@article{Auriau2024,
doi = {10.21105/joss.06899},
url = {https://doi.org/10.21105/joss.06899},
year = {2024},
publisher = {The Open Journal},
volume = {9},
number = {101},
pages = {6899},
author = {Vincent Auriau and Ali Aouad and Antoine Désir and Emmanuel Malherbe},
title = {Choice-Learn: Large-scale choice modeling for operational contexts through the lens of machine learning},
journal = {Journal of Open Source Software} }
The use of this software is under the MIT license, with no limitation of usage, including for commercial applications.