Online Policy Learning and Inference by Matrix Completion




arXiv:2404.17398v1 Announce Type: new
Abstract: Making online decisions can be challenging when features are sparse and orthogonal to historical ones, especially when the optimal policy is learned through collaborative filtering. We formulate the problem as a matrix completion bandit (MCB), where the expected reward under each arm is characterized by an unknown low-rank matrix. The $\epsilon$-greedy bandit and the online gradient descent algorithm are explored. Policy learning and regret performance are studied under a specific schedule for exploration probabilities and step sizes. A faster decaying exploration probability yields smaller regret but learns the optimal policy less accurately. We investigate an online debiasing method based on inverse propensity weighting (IPW) and a general framework for online policy inference. The IPW-based estimators are asymptotically normal under mild arm-optimality conditions. Numerical simulations corroborate our theoretical findings. Our methods are applied to the San Francisco parking pricing project data, revealing intriguing discoveries and outperforming the benchmark policy.
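To make the ingredients named in the abstract concrete, here is a minimal toy sketch of an $\epsilon$-greedy matrix completion bandit with online gradient descent on low-rank factors, plus a simple IPW-weighted average. It is not the paper's algorithm or estimator. The two-arm setup, the schedule `eps_t = t^(-gamma)`, the constant step size `eta`, the rank-1 toy reward matrices, and the names `exploration_prob` and `run_mcb` are all assumptions made for illustration.

```python
import random


def exploration_prob(t, gamma=0.3):
    """Hypothetical exploration schedule eps_t = t^(-gamma) for t >= 1."""
    return t ** (-gamma)


def run_mcb(T=5000, d1=6, d2=6, rank=1, eta=0.05, gamma=0.3, noise=0.1, seed=0):
    rng = random.Random(seed)
    # Toy ground truth: each of two arms has a rank-1 expected-reward matrix.
    u = [[rng.uniform(0.5, 1.5) for _ in range(d1)] for _ in range(2)]
    v = [[rng.uniform(0.5, 1.5) for _ in range(d2)] for _ in range(2)]
    M = [[[u[k][i] * v[k][j] for j in range(d2)] for i in range(d1)]
         for k in range(2)]
    # Factor estimates U_k (d1 x rank) and V_k (d2 x rank) for each arm.
    U = [[[rng.uniform(0.5, 1.0) for _ in range(rank)] for _ in range(d1)]
         for _ in range(2)]
    V = [[[rng.uniform(0.5, 1.0) for _ in range(rank)] for _ in range(d2)]
         for _ in range(2)]

    def predict(k, i, j):
        return sum(a * b for a, b in zip(U[k][i], V[k][j]))

    regret = 0.0
    ipw_sum = 0.0
    for t in range(1, T + 1):
        # Context: one uniformly sampled matrix entry (i, j).
        i, j = rng.randrange(d1), rng.randrange(d2)
        eps = exploration_prob(t, gamma)
        greedy = 0 if predict(0, i, j) >= predict(1, i, j) else 1
        # Epsilon-greedy: explore uniformly with probability eps.
        arm = rng.randrange(2) if rng.random() < eps else greedy
        # Probability that this arm was chosen (its propensity).
        prop = (1 - eps) * (arm == greedy) + eps / 2
        reward = M[arm][i][j] + rng.gauss(0, noise)
        regret += max(M[0][i][j], M[1][i][j]) - M[arm][i][j]
        # IPW term for arm 1: reweight observed rewards by 1 / propensity.
        if arm == 1:
            ipw_sum += reward / prop
        # Online gradient step on the squared error at entry (i, j).
        resid = predict(arm, i, j) - reward
        u_old = list(U[arm][i])
        U[arm][i] = [a - eta * resid * b for a, b in zip(u_old, V[arm][j])]
        V[arm][j] = [b - eta * resid * a for a, b in zip(u_old, V[arm][j])]

    true_mean = sum(sum(row) for row in M[1]) / (d1 * d2)
    return {"regret": regret,
            "ipw_mean_arm1": ipw_sum / T,
            "true_mean_arm1": true_mean}


if __name__ == "__main__":
    print(run_mcb())
```

Raising `gamma` makes exploration decay faster. In this toy setting that shrinks the propensities of non-greedy arms and so inflates the IPW weights, which loosely mirrors the trade-off between regret and estimation accuracy that the abstract describes.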


