A Survey of Diffusion Models

Last updated on March 16, 2026

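In last year's TRECVID competition we used stable-diffusion-v1-5, a model from 2023. In our experiments the image quality felt low, and the rendering of human figures in particular left room for improvement. That prompted this survey of newer, readily available models for use in later experiments.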

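The survey focuses on two criteria, ease of use and generation quality, and leans toward practical application rather than academic research.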

Example

We see a girl in a dark dress pushing the door of a convenience store, after it closes, she runs away. There are two bikes and four trash cans in front of the shop windows. The store's brand colors are green, white and blue.

GT (ground truth):

Model	stable-diffusion-v1-5
image
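
For reference, here is a minimal sketch of how the example prompt above could be run through stable-diffusion-v1-5 with the Hugging Face diffusers library. The repo id, step count, guidance scale and seed are assumptions chosen for illustration, not the settings used in our TRECVID runs. The function name `generate` is likewise hypothetical.

```python
# Assumed Hugging Face repo id for the v1.5 weights; replace with the mirror you use.
MODEL_ID = "stable-diffusion-v1-5/stable-diffusion-v1-5"

# The example prompt from this note.
PROMPT = (
    "We see a girl in a dark dress pushing the door of a convenience store, "
    "after it closes, she runs away. There are two bikes and four trash cans "
    "in front of the shop windows. The store's brand colors are green, white and blue."
)


def generate(prompt, model_id=MODEL_ID, steps=30, guidance_scale=7.5, seed=0):
    """Generate one image for `prompt` and return it as a PIL image.

    The default values are illustrative assumptions, not tuned settings.
    """
    # Heavy imports stay inside the function so this file loads without them.
    import torch
    from diffusers import StableDiffusionPipeline

    # Use half precision on GPU; fall back to full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    # A fixed seed keeps runs comparable when swapping in other models.
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        generator=generator,
    )
    return result.images[0]
```

Calling `generate(PROMPT).save("sd15_example.png")` would download the weights on first use and write one image to disk. Keeping the prompt and seed fixed while changing only `model_id` gives a rough side-by-side comparison, although newer model families generally need their own pipeline class rather than `StableDiffusionPipeline`.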
