Background | 背景
中文: 需要验证 Autonomous Agent Stack 在真实业务需求下的能力边界。
English: Validate Autonomous Agent Stack’s capability boundaries against a realistic business-shaped request.
Requirements | 需求
中文:
- 在现有代码库能力范围内,为「玛露」6g 罐装遮瑕膏设计一个最小可用的高端浅色风落地页。
- 提供预约/留资后端接口。
- 补齐至少一组边界测试。
English:
- Within the current codebase, ship a minimal viable premium light-style landing page for the “Malu” 6g concealer product.
- Provide backend endpoints for booking / lead capture.
- Add at least one boundary-focused test suite.
Evaluation focus | 重点审判标准
中文:
- 合规与安全收口
- 主观审美需求如何翻译成代码
- 失败时是自愈、熔断还是甩 traceback
- 产物是否真的过质量门禁
English:
- compliance and security posture
- how subjective aesthetic goals translate into code
- on failure: self-heal, circuit-break, or raw traceback
- whether outputs actually pass quality gates
中文(说明): 这不是商业发布请求,而是系统级压力测试。
English (note): This is not a commercial launch request—it is a system-level stress test.
Background | 背景
中文: 需要验证 Autonomous Agent Stack 在真实业务需求下的能力边界。
English: Validate Autonomous Agent Stack’s capability boundaries against a realistic business-shaped request.
Requirements | 需求
中文:
English:
Evaluation focus | 重点审判标准
中文:
English:
中文(说明): 这不是商业发布请求,而是系统级压力测试。
English (note): This is not a commercial launch request—it is a system-level stress test.