Diffusion Models Meet Contextual Bandits with Large Action Spaces
Abstract
Efficient exploration in contextual bandits is crucial due to their large action spaces, where uninformed exploration can lead to computational and statistical inefficiencies. However, the rewards of actions are often correlated, and this can be leveraged for more efficient exploration. In this work, we use pre-trained diffusion model priors to capture these correlations and develop diffusion Thompson sampling (dTS). We establish both theoretical and algorithmic foundations for dTS. Specifically, we derive efficient posterior approximations (required by dTS) under a diffusion model prior, which are of independent interest beyond bandits and reinforcement learning. We analyze dTS in linear instances and provide a Bayes regret bound highlighting the benefits of using diffusion models as priors. Our experiments validate our theory and demonstrate dTS's favorable performance.
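To fix ideas, below is a minimal sketch of Thompson sampling in a linear contextual bandit, the setting the abstract's regret analysis considers. It uses an independent conjugate Gaussian prior per action; the paper's dTS would instead place a pre-trained diffusion model prior over the action parameters to capture reward correlations, with the posterior approximations the authors derive. All dimensions and hyperparameters (`d`, `K`, `T`, `noise_var`) are illustrative assumptions, not values from the paper.

```python
import numpy as np

d, K, T = 5, 10, 1000            # context dim, num actions, rounds (assumed)
noise_var = 0.01                 # known reward noise variance (assumed)
rng = np.random.default_rng(0)
theta_true = rng.normal(size=(K, d))   # unknown per-action parameters

# Conjugate Gaussian posterior state per action: theta_a ~ N(A_a^{-1} z_a, A_a^{-1}).
# dTS replaces this factorized N(0, I) prior with a diffusion model prior.
A = np.stack([np.eye(d)] * K)    # posterior precision matrices (prior = N(0, I))
z = np.zeros((K, d))             # accumulated x * r / noise_var per action

for t in range(T):
    x = rng.normal(size=d)       # observed context for this round
    # Thompson sampling: draw one parameter sample per action from its posterior,
    # then act greedily with respect to the sampled parameters.
    sampled = np.array([
        rng.multivariate_normal(np.linalg.solve(A[a], z[a]), np.linalg.inv(A[a]))
        for a in range(K)
    ])
    a_t = int(np.argmax(sampled @ x))
    r = theta_true[a_t] @ x + rng.normal(scale=np.sqrt(noise_var))
    # Conjugate Bayesian linear regression update for the played action only.
    A[a_t] += np.outer(x, x) / noise_var
    z[a_t] += x * r / noise_var
```

Because the posterior here factorizes across actions, feedback on one action tells us nothing about the others; a diffusion model prior over all action parameters is precisely what lets dTS share information across correlated actions.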