Results
Evaluation Results:
- 14 Artifact Available, Functional, and Results Reproduced
- 7 Artifact Available and Functional
- 0 Artifact Functional and Results Reproduced
| Title | Available | Functional | Reproduced | Available at |
|---|---|---|---|---|
| AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting | Yes | Yes | | Artifact |
| Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints | Yes | Yes | | Artifact |
| CAMI: Cost-Aware Agent-Guided Multi-Indexing for Semantic Retrieval | Yes | Yes | | Artifact |
| Glia: A Human-Inspired AI for Automated Systems Design and Optimization | Yes | Yes | | Artifact |
| OpaqueToolsBench: Learning Nuances of Tool Behavior Through Interaction | Yes | Yes | | Artifact |
| Robust Agent Compensation (RAC): Teaching AI Agents to Compensate | Yes | Yes | | Artifact |
| ViBench: A Benchmark on Vibe Coding | Yes | Yes | | Artifact |
| AgentStop: Terminating Local AI Agents Early to Save Energy in Consumer Devices | Yes | Yes | Yes | Artifact |
| Context, Reasoning, and Hierarchy: A Cost–Performance Study of Compound LLM Agent Design in an Adversarial POMDP | Yes | Yes | Yes | Artifact |
| Do Agents Need to Plan Step-by-Step? Rethinking Planning Horizon in Data-Centric Tool Calling | Yes | Yes | Yes | Artifact |
| Exploring and Developing a Pre-Model Safeguard with Draft Models | Yes | Yes | Yes | Artifact |
| FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast | Yes | Yes | Yes | Artifact |
| How To Steer Your Multi-Agent System: Human-LLM Collaborative Planning | Yes | Yes | Yes | Artifact |
| Improving Coherence and Persistence in Agentic AI for System Optimization | Yes | Yes | Yes | Artifact |
| Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation | Yes | Yes | Yes | Artifact |
| optimize_anything: Unified Text Optimization can Outperform Specialized Systems | Yes | Yes | Yes | Artifact |
| Retrieval-Augmented LLMs for Security Incident Analysis | Yes | Yes | Yes | Artifact |
| Securing Agents With Tracked Capabilities | Yes | Yes | Yes | Artifact |
| Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use | Yes | Yes | Yes | Artifact |
| Vista: Verifier-in-the-Loop Agentic Reinforcement Learning for Quantum Program Synthesis | Yes | Yes | Yes | Artifact |
| Who Decides the Trade-off? Resolution Policy as Delegation Governance in Autonomous Agents | Yes | Yes | Yes | Artifact |