[2302.06595] When Can We Track Significant Preference Shifts in Dueling Bandits?


View a PDF of the paper titled When Can We Track Significant Preference Shifts in Dueling Bandits?, by Joe Suk and Arpit Agarwal


Abstract: The $K$-armed dueling bandits problem, where the feedback is in the form of noisy pairwise preferences, has been widely studied due to its applications in information retrieval, recommendation systems, etc. Motivated by concerns that user preferences/tastes can evolve over time, we consider the problem of dueling bandits with distribution shifts. Specifically, we study the recent notion of significant shifts (Suk and Kpotufe, 2022), and ask whether one can design an adaptive algorithm for the dueling problem with $O(\sqrt{K\tilde{L}T})$ dynamic regret, where $\tilde{L}$ is the (unknown) number of significant shifts in preferences. We show that the answer to this question depends on the properties of the underlying preference distributions.

Firstly, we give an impossibility result that rules out any algorithm with $O(\sqrt{K\tilde{L}T})$ dynamic regret under the well-studied Condorcet and SST classes of preference distributions. Secondly, we show that $\text{SST} \cap \text{STI}$ is the largest amongst popular classes of preference distributions where it is possible to design such an algorithm. Overall, our results provide an almost complete resolution of the above question for the hierarchy of distribution classes.

Submission history

From: Joe Suk
Mon, 13 Feb 2023 18:49:50 UTC (81 KB)
Wed, 24 Jan 2024 21:50:39 UTC (534 KB)
