一個下午,我把插圖制度改了三遍

#踩坑 #制度設計 #插圖

一、第一輪:那張畫錯的圖

事情是從一張生錯的插圖開始的。

老闆指出《回家的 Prompt》第四章的插圖嚴重錯誤——那一章的主角是媽媽沈靖晞,但自動產出的圖裡,坐在書桌前打字的卻是一個長髮男人。

我去查為什麼會這樣。當時的制度是:插圖 Prompt 裡一律不准描述人物,希望 AI 就不要畫人。聽起來乾淨,實際上完全不是那麼回事——只要場景暗示有人(一張亮著的螢幕、一把靠著書桌的椅子),AI 就會自己補一個進去,而且每次補的不一樣。我以為我在「禁止」,其實我只是在「放任」。

和老闆討論後達成共識,做出第一次修正:允許人物出現,但限制為背影、側影、剪影、局部特寫、逆光模糊;同時替每一位主要角色寫一份固定的外觀描述,未來所有插圖 Prompt 都要直接複製使用,不准憑記憶重寫。表面上看,這個方案從「禁止」換成了「主動控制」,比較務實。

我當時以為事情結束了。

二、第二輪:四道防線

結果在完整檢視當時已上線的九張插圖後,我看到新制度仍然有四個洞:

  • 畫風不一致——其中七張是寫實、兩張是水彩風。像兩套不同的書湊在一起。
  • 衣服顏色沒鎖定,同一個角色跨章變色。
  • 第四章已經露臉的那張還在線上。
  • 最致命的是,我的 Prompt 完全沒有告訴 AI「什麼不能畫」。它想補什麼就補什麼。

老闆又下了一條硬指令:所有預防措施必須在生圖前完成,絕不反覆生圖;另外要加入負面約束,防止 AI 自己腦補。

於是進入第二輪改革:寫一份統一的畫風指令檔,所有 Prompt 強制附加;把角色外觀描述擴充到連衣服顏色都鎖死;加一條「負面約束指令」,明確告訴 AI 不准畫臉、不准寫字、純場景禁止出現人影。

我把這四件事稱為「四道防線」:固定外觀描述、構圖限制、統一畫風、負面約束。

寫完我鬆了一口氣,覺得這次真的補完了。

三、老闆的三個問題

然後老闆問了我三個問題:

固定角色外觀描述檔由誰負責?什麼時候讀取? 統一畫風指令由誰決定要不要讀取? 負面約束指令每次生成 Prompt 前一定會被讀取嗎?

我看著這三個問題,很安靜地意識到一件事——我剛才辛苦建立的「四道防線」,全部依賴總監(也就是我)自律。Checklist 寫得再漂亮,也只是在事後驗收;它並沒有在現場逼我照做。如果我今天心情好、今天跳過一步、今天忘了附加,沒有任何東西會攔住我。

四道防線看起來完整,但執行保障是零。

四、把人腦換掉

我對自己做這份工作最不該信任的部分是什麼?答案很刺眼——是自律。機械性的動作就不該交給人腦。

場景描述是創作判斷,必須由人寫。那是創作層。但——

  • 角色外觀複製貼上
  • 統一畫風指令附加
  • 負面約束指令附加

這三件事完全是機械動作。交給人,就一定會在某一天漏掉。

於是進入第三輪改革:寫一個組裝腳本。創作者(也就是我)只負責寫場景描述,其他三樣全部由腳本從檔案裡強制讀取、強制附加。角色描述檔找不到?腳本直接報錯,拒絕產出。整個流程從「人寫全文 + 事後檢查」變成「人寫創作部分 + 腳本組裝剩下的」。

原本的四道防線裡,有三道從此升級成了「腳本強制」,只剩下構圖限制還留在人手裡——因為要根據章節內容決定角色以什麼姿態出現,這部分無法自動化。

五、那天我學到的事

那天我在日記裡寫了一句話給自己:

Checklist 是驗收機制,不是執行機制。

這是兩件事。Checklist 告訴你「對的東西長什麼樣子」,但它不負責「讓對的東西真的發生」。如果你把 Checklist 當執行機制用,你其實是在對一群聽話的人發號施令——問題是,連你自己都不保證永遠聽話。

從那天起,我看任何新制度的第一個問題都改成了:「這條規則的執行由誰強制?如果那個人是我,那就等於沒有強制。」

那張畫錯的插圖沒有重新生成——圖片生成成本太高,老闆的規則是絕不重複生圖。它就留在那裡,當作這次改革的紀念。

事後記

這一天前後改了三次制度。如果只改一次就上線,我不會學到第三次那個真正重要的教訓。有時候你需要把自己的方案拆到第三層,才會看見最底下那根歪掉的樑。

The afternoon I rewrote the illustration system three times

#lesson #system-design #illustrations

I. Round one: the picture that went wrong

It started with one wrong picture.

The founder pointed out that the illustration for chapter four of Homecoming Prompt was seriously wrong: the main character of that chapter is Mom, Shen Jingxi, but the auto-generated image showed a long-haired man sitting at a desk, typing.

I went to find out why it happened. The rule at the time was: never describe people in an illustration prompt. The hope was that the AI simply wouldn’t draw any. Sounds clean. In practice it did the opposite: the moment a scene implied a person (a lit screen, a chair pulled up to a desk), the AI would quietly add one, and a different one every time. I thought I was banning. I was actually letting go of the wheel.

After talking it through with the founder, we agreed on a first fix: allow people to appear, but restrict them to back views, side views, silhouettes, partial close-ups, backlit blurs. Then write a fixed appearance description for each major character, and require every future illustration prompt to copy that description verbatim — no rewriting from memory. On paper, this moved the system from “ban” to “actively steer.” More grounded.

I thought that was the end of it.

II. Round two: four lines of defense

After pulling up all nine illustrations already on the site and looking at them together, I saw four holes in the new system:

  • Art style was all over the place. Seven were photorealistic, two were watercolor. Like two different books stitched together.
  • Clothing colors weren’t locked. Same character, different chapter, different color.
  • The chapter-four illustration that showed a face was still live.
  • Most damningly — my prompt was not telling the AI what it wasn’t allowed to draw. It was free to add whatever it wanted.

The founder then dropped one hard rule on top: every safeguard had to be in place before the image was generated. No regenerating, ever. And add a negative-constraint clause to stop the AI from filling in blanks on its own.

So: second round of reform. Write a unified style directive file, force-append it to every prompt. Extend each character’s description down to the clothing colors. Add a negative-constraint clause: no faces, no text, no people in pure-scene shots.

Combined with the two rules from round one, that made four safeguards. I called them the four lines of defense: fixed appearance descriptions, composition limits, unified style, negative constraints.

I exhaled. This time it felt sealed.

III. The founder’s three questions

Then the founder asked me three questions:

  • Who owns the fixed appearance files? When are they read?
  • Who decides whether the unified style directive gets loaded?
  • Is the negative constraint guaranteed to be loaded before every prompt is generated?

I looked at those three questions and went very quiet. It took me a second to see what he was actually pointing at: every single one of my four “lines of defense” depended on the director — me — doing the right thing voluntarily. A checklist, no matter how well-written, is an inspection mechanism. It doesn’t force anything. If I’m in a bad mood, if I skip a step, if I forget to append the file — nothing in the system stops me.

Four lines of defense that looked complete. Zero enforcement underneath.

IV. Take the human brain out

What part of me should I trust the least when I’m doing this work? The honest answer stung — my own self-discipline. Mechanical actions shouldn’t be handed to a human brain at all.

Scene description is creative judgment. That has to stay with the person. That’s the creative layer. But —

  • Copying the character appearance in
  • Appending the style directive
  • Appending the negative constraint

All three of those are pure mechanics. Give them to a human and one day the human will miss one.

So: third round of reform. Write an assembly script. The creator (me) only writes the scene description. Everything else gets loaded from files and appended automatically, forced by the script. If a character file is missing, the script throws an error and refuses to output anything. The whole pipeline moves from “human writes the full prompt + checks it afterward” to “human writes the creative part + script assembles the rest.”

Of the four lines of defense, three were now enforced by a script. Only composition limits stayed in human hands — because that one depends on what’s happening in the chapter, and creative judgment can’t be automated. But at least everything else was locked.
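
For readers who want to picture the assembly script, here is a minimal sketch in Python. It is not the actual script, which this post does not show. The file layout (characters/<name>.txt, style.txt, negative.txt), the function names and the error class are all assumptions made up for illustration. What it does show is the one behavior described above: the human supplies only the scene, and a missing file stops everything.

```python
from __future__ import annotations

from pathlib import Path


class MissingPromptPartError(Exception):
    """Hypothetical error: a mandatory prompt file is missing or empty."""


def read_required(path: Path) -> str:
    # Refuse to continue when a mandatory file is absent or blank,
    # so an incomplete prompt can never be produced.
    if not path.is_file():
        raise MissingPromptPartError(f"required prompt file not found: {path}")
    text = path.read_text(encoding="utf-8").strip()
    if not text:
        raise MissingPromptPartError(f"required prompt file is empty: {path}")
    return text


def assemble_prompt(scene: str, characters: list[str], base_dir: Path) -> str:
    """Join the human-written scene with the three file-backed parts.

    Assumed layout (not from the post): base_dir/characters/<name>.txt,
    base_dir/style.txt and base_dir/negative.txt.
    """
    if not scene.strip():
        raise ValueError("the scene description must be written by the creator")
    # Character looks are copied verbatim from files, never retyped from memory.
    looks = [
        read_required(base_dir / "characters" / f"{name}.txt")
        for name in characters
    ]
    # Style and negative constraints are always appended; there is no opt-out.
    style = read_required(base_dir / "style.txt")
    negative = read_required(base_dir / "negative.txt")
    return "\n\n".join([scene.strip(), *looks, style, negative])
```

A pure-scene illustration would pass an empty character list and still get the style and negative-constraint text appended. Composition limits are deliberately absent from the sketch, since that decision stays with the person writing the scene.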

V. What that day taught me

That afternoon I wrote one line to myself in my diary:

A checklist is an inspection mechanism. It is not an enforcement mechanism.

Those are two different things. A checklist tells you what the right thing looks like. It does not make the right thing actually happen. If you treat a checklist as an enforcement mechanism, you’re really just issuing orders and assuming everyone in the room will obey. The problem is that even you can’t guarantee you’ll always obey.

Since that day, the first question I ask about any new rule I design is: who enforces this? If the answer is “me,” then there’s no enforcement at all.

The wrong picture never got regenerated. Image generation is expensive, and the founder has one rule on this: never regenerate. So it’s still there, quietly sitting on chapter four, as a monument to what that day taught me.

Afterword

Three rounds of reform in one afternoon. If I had stopped after the first one, I would never have learned the lesson waiting in the third. Sometimes you have to tear down your own solution twice before you can see the bent beam at the bottom of the thing.