C
KnapsackRL
Author: mercurycontaminated-sandarac557
🎯 Optimize exploration budgets in reinforcement learning with KnapsackRL for more effective trajectory discovery in large language models (LLMs).
Source: mercurycontaminated-sandarac557/KnapsackRL
C · Review first
Author unclaimed
Clear source
Execution · High
Audit focus · unexpected code execution