Strategy keynote · Decksmith experiment

Reliability, before autonomy.

First make the local agent system distort less; then let it run further.

Priority

P0

Evidence first.
Automation second.

Decision

Do less.
Verify more.

P0

Evidence trace

Complex research and mechanism changes must leave an evidence chain.

P1

Sub-agent receipts

Sub-agents hand back sources, confidence levels, and the next verification step.

Later

Memory governance

Wait until scale genuinely grows, then add systematic governance.

Problem

The agent doesn’t fail loudly.

Its more common failure mode is packaging low-confidence judgments as smooth conclusions.

01

Missing sources

Only the summary survives, with no path back to the original evidence.

02

Inference mixed into fact

It looks plausible, but sourced facts and model judgments are not separated.

03

Low-confidence spread

Once an error enters memory or a skill, it gets reused.

P0

Make conclusions traceable.

Enable only for high-risk tasks: deep research, sub-agent synthesis, and changes to agent / skill / memory mechanisms.

A

Sources checked

Links, files, and command output stay next to the conclusion.

B

Confirmed vs inferred

Keep facts and judgments in separate layers; do not mix them.

C

Unverified claims

Uncertain items are recorded explicitly and kept out of long-term systems.
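
One way to picture the A / B / C layers is as a small record kept next to each conclusion. The sketch below is only an illustration: the class name `EvidenceTrace`, its fields, the task-kind labels, and the rule in `may_enter_long_term` are all assumptions, not part of any existing system described in this deck.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the high-risk task kinds named on the slide.
HIGH_RISK_TASKS = {"deep_research", "subagent_synthesis", "mechanism_change"}


@dataclass
class EvidenceTrace:
    """Illustrative record stored beside a conclusion (not an existing API)."""
    conclusion: str
    sources_checked: list[str] = field(default_factory=list)  # A: links, files, command output
    confirmed: list[str] = field(default_factory=list)        # B: facts backed by a source
    inferred: list[str] = field(default_factory=list)         # B: model judgments, kept apart
    unverified: list[str] = field(default_factory=list)       # C: claims still open

    def may_enter_long_term(self) -> bool:
        # Assumed rule: at least one checked source and no open unverified claims.
        return bool(self.sources_checked) and not self.unverified


def needs_trace(task_kind: str) -> bool:
    """Only high-risk task kinds require a trace."""
    return task_kind in HIGH_RISK_TASKS
```

Keeping `confirmed` and `inferred` as separate fields makes it hard to mix the two by accident, which is the point of layer B.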

P1

Sub-agents need receipts.

Input

Task boundary

State exactly what to look up; do not let the sub-agent summarize loosely.

→

Output

Evidence + confidence

Every finding carries a source and a confidence level.

→

Main thread

Verify before use

Findings enter the conclusion only after the main thread verifies them.
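
The Input → Output → Main thread flow could look roughly like the following. This is a minimal sketch under assumptions: `Finding`, `Receipt`, and `usable_findings` are hypothetical names, and the 0.0 to 1.0 confidence scale is chosen here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    claim: str
    source: str             # where the sub-agent found it
    confidence: float       # assumed scale: 0.0 to 1.0, self-reported
    verified: bool = False  # flipped only by the main thread after checking


@dataclass
class Receipt:
    task_boundary: str      # what the sub-agent was asked to look up
    findings: list[Finding]


def usable_findings(receipt: Receipt) -> list[Finding]:
    """Keep only findings that cite a source and were verified by the main thread."""
    return [f for f in receipt.findings if f.verified and f.source]
```

Because `verified` defaults to `False`, a finding the main thread never looked at cannot reach the conclusion by default.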

Later

Don’t govern memory too early.

For now, keep using Cloudflare AI Search plus human judgment. Add a governance layer once recall quality degrades or scale approaches the threshold.

Trigger

1000+

memories before systematic governance
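
The trigger can be written down as a single check. In this sketch, the 1000 figure comes from the slide; the function name, the `recall_is_messy` flag (a human judgment call), and the 0.9 "approaching" fraction are assumptions made for illustration.

```python
GOVERNANCE_THRESHOLD = 1000  # from the slide: 1000+ memories


def should_add_governance(
    memory_count: int,
    recall_is_messy: bool,
    near_fraction: float = 0.9,  # assumed meaning of "approaches the threshold"
) -> bool:
    """True once recall quality degrades or the memory count nears the threshold."""
    return recall_is_messy or memory_count >= GOVERNANCE_THRESHOLD * near_fraction
```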

Already fixed

Signal routing is now personal.

Public

What happened today?

Trending topics across the web, to keep a public view.

Personal

What should we use?

Profiled from X bookmarks, prioritizing agent / memory / skill / workflow.
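
A very rough picture of the two routes, using plain keyword matching as a stand-in for the bookmark profile. The deck does not describe the real routing logic, so the function name and the matching rule here are assumptions; only the four priority topics come from the slide.

```python
# Priority topics listed on the slide.
PRIORITY_TOPICS = ("agent", "memory", "skill", "workflow")


def route(item_title: str) -> str:
    """Return "personal" if the title mentions a priority topic, else "public".

    Substring matching is a simplification: it also matches plurals such as
    "agents", and unrelated words that merely contain a topic string.
    """
    text = item_title.lower()
    return "personal" if any(topic in text for topic in PRIORITY_TOPICS) else "public"
```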

Operating principle

Build the guardrails that make future speed safe.

Fix distortion first, then expand capability.

Evidence trace

Receipt handoff

Governance later
