Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback - ESEO-ERIS
Preprint, Working Paper. Year: 2020

Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback

Abstract

Recent work on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed Bandits (COM-MAB) shows good results on a global accuracy metric. In the case of recommender systems, this can be achieved through personalization. However, with a combinatorial online learning approach, personalization requires a large amount of user feedback. Such feedback can be hard to acquire when users must be solicited directly and frequently. For many fields of activity undergoing the digitization of their business, online learning is unavoidable. Thus, a number of approaches enabling implicit user feedback retrieval have been implemented. Nevertheless, this implicit feedback can be misleading or inefficient for the agent's learning. Herein, we propose a novel approach that reduces the number of explicit feedback requests required by Combinatorial Multi-Armed Bandit (COM-MAB) algorithms while providing levels of global accuracy and learning efficiency similar to those of classical competitive methods. In this paper we present a novel approach for considering user feedback and evaluate it using three distinct strategies. Despite the limited amount of feedback returned by users (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.
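As a rough illustration of the setting described above (and not of the paper's partial/semi-bandit strategies themselves), the following sketch simulates a combinatorial epsilon-greedy agent that plays a super-arm of several items per round but receives explicit feedback on only a fraction of them, e.g. 20%. All names, parameters, and the epsilon-greedy strategy are assumptions for illustration only:

```python
import random

def scarce_feedback_bandit(true_means, m=3, feedback_rate=0.2,
                           epsilon=0.1, rounds=5000, seed=0):
    """Toy COM-MAB simulation: each round the agent plays a super-arm of
    m base arms, but each played arm returns explicit feedback only with
    probability feedback_rate. Illustrative sketch only."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k   # number of feedback observations per base arm
    est = [0.0] * k    # empirical mean reward per base arm
    for _ in range(rounds):
        if rng.random() < epsilon:
            chosen = rng.sample(range(k), m)                  # explore
        else:
            chosen = sorted(range(k), key=lambda a: est[a],
                            reverse=True)[:m]                 # exploit
        for a in chosen:
            # Scarce feedback: the user responds only feedback_rate
            # of the time (e.g. 20% of solicitations).
            if rng.random() < feedback_rate:
                reward = 1.0 if rng.random() < true_means[a] else 0.0
                counts[a] += 1
                est[a] += (reward - est[a]) / counts[a]
    return est

estimates = scarce_feedback_bandit([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])
```

Even though only about one play in five yields feedback, the incremental mean updates still separate good arms from bad ones given enough rounds, which is the intuition behind learning from scarce explicit feedback.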

Dates and versions

hal-02947320, version 1 (23-09-2020)

Identifiers

Cite

Alexandre Letard, Tassadit Amghar, Olivier Camp, Nicolas Gutowski. Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback. 2020. ⟨hal-02947320⟩