arxiv:2509.25455
Konstantin Grotov
konstantgr
Recent Activity
updated a model about 1 month ago
JetBrains-Research/Qwen3-30B-A3B-am
upvoted a paper about 1 month ago
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation
authored a paper 2 months ago
PIPer: On-Device Environment Setup via Online Reinforcement Learning