


"DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for ..."
Xuening Feng et al. (2025)
- Xuening Feng, Zhaohui Jiang, Timo Kaufmann, Puchen Xu, Eyke Hüllermeier, Paul Weng, Yifei Zhu:
DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback. AAAI 2025: 16604-16612
