Bowen’s random thoughts

Sprint Training, 5K Training, and Strength Training

2026-02-07T00:00:00+00:00

In real-world scenarios, endurance, strength, and explosive power are all very important. People usually think that explosive power, such as your 100m time, is very hard to improve, while endurance and strength are much easier. In my experience, however, explosive power is sometimes easier to improve than endurance for a long run like the 5K. The 5K is very hard to improve beyond a certain point. It requires a lot of weekly mileage plus a very high VO2max and aerobic capacity, which mostly come from natural talent. Take me as an example: when I increased my weekly mileage from 15 miles to 25 miles, I did not see a significant change in my 5K (my PB at the time was 19:37), and I ran several bad races after high-mileage weeks. What I discovered is that sprinting is far more technical than endurance training: it requires carefully designed posture and mechanics. Once you get those right, you can improve very quickly. For me, my 100m went from 14.6s to 13.5s in one year, and then to 12.9s in two more months. Get the posture right, train consistently, and it will improve. I also saw my 5K time improve even though I did not train for it specifically.

This is one of the most subtle points that people miss: endurance training is actually beneficial to sprint training. It helps you recover faster, which is super important after hard sprints. However, the endurance training must be truly low-intensity Z2 training. No 800m pace. No VO2max. No threshold. Slow Z2 running will do the trick. You can add threshold and VO2max work later, once you are satisfied with your sprint times, but not during your sprint training block.

Controllable and Constrained Sampling (CCS): Revealing Linearity in Diffusion Sampling

2025-09-18T00:00:00+00:00

✨ Introduction

Diffusion models have become the foundation of modern generative modeling, powering text-to-image systems like Stable Diffusion and DALL·E. Yet, the mechanism by which these models generate diverse yet realistic samples remains somewhat mysterious.

In our recent work, Controllable and Constrained Sampling (CCS), we uncover a surprisingly simple but powerful property underlying diffusion sampling: an approximately linear relationship between input and output that shows up widely during DDIM sampling.

🔍 The Hidden Linearity in DDIM Sampling

DDIM (Denoising Diffusion Implicit Models) sampling is often viewed as a nonlinear iterative denoising process. However, when we analyze the mapping between the initial noise (x_T) and the final generated sample (x_0), we find an almost linear relationship between changes in the output and changes in the input.

Empirically, this linearity is strikingly consistent across datasets, models, and sampling steps, indicating that DDIM sampling acts approximately like a linear transformation in the high-dimensional data space.
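To make the idea of "linearity between input change and output change" concrete, here is a minimal sketch of how one might probe it. Everything in it is an assumption for illustration: `linearity_score` is a hypothetical helper, the two toy samplers are stand-ins for a real DDIM sampler, and the Pearson correlation is just one possible way to score linearity, not necessarily the metric used in the paper.

```python
import numpy as np


def linearity_score(sampler, x_T, direction, scales):
    """Pearson correlation between input-change and output-change magnitudes.

    `sampler` is any deterministic map from initial noise to a sample
    (hypothetical interface standing in for a DDIM sampler).
    """
    direction = direction / np.linalg.norm(direction)
    x0_ref = sampler(x_T)
    # With a unit direction and positive scales, ||x_T' - x_T|| equals the scale.
    d_in = np.asarray(scales, dtype=float)
    d_out = np.array(
        [np.linalg.norm(sampler(x_T + s * direction) - x0_ref) for s in scales]
    )
    return float(np.corrcoef(d_in, d_out)[0, 1])


rng = np.random.default_rng(0)
dim = 16
A = rng.standard_normal((dim, dim))


def toy_linear_sampler(x):
    # Assumption: an exactly linear stand-in sampler.
    return A @ x


def toy_saturating_sampler(x):
    # Assumption: a strongly nonlinear stand-in sampler.
    return np.tanh(A @ x)


x_T = rng.standard_normal(dim)
direction = rng.standard_normal(dim)
scales = np.linspace(0.1, 5.0, 20)

score_linear = linearity_score(toy_linear_sampler, x_T, direction, scales)
score_saturating = linearity_score(toy_saturating_sampler, x_T, direction, scales)
print(score_linear, score_saturating)
```

With a real model, `sampler` would wrap a full deterministic DDIM run, and a lower score would be the kind of signal one could inspect for out-of-distribution inputs.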

🎛️ Controlling Samples through Linearity

This linear structure has deep implications.
Because the output (x_0) depends linearly on the input (x_T), we can directly control the generation process by perturbing the initial noise in a principled way.

Example: Controlling the Mean and MSE

Sample Mean Control:
By shifting (x_T) in a known direction, we can predictably adjust the mean of generated samples.
Sample MSE Control:
By scaling or projecting noise components, we can constrain the output variance or reconstruction error—without retraining the model.

In effect, CCS turns a black-box sampler into a controllable generator with explicit, mathematically interpretable levers.
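As a rough illustration of why linearity makes the output predictable, the sketch below probes a sampler once with a unit shift of the initial noise and then extrapolates to a larger shift. It is only a sketch under stated assumptions: `predict_shifted_output` and `toy_sampler` are hypothetical names, the toy sampler is an affine map rather than a diffusion model, and this is not the CCS algorithm itself.

```python
import numpy as np


def predict_shifted_output(sampler, x_T, direction, scale):
    """Predict sampler(x_T + scale * direction) from one unit-shift probe,
    assuming the sampler responds linearly to shifts of x_T."""
    x0_ref = sampler(x_T)
    response = sampler(x_T + direction) - x0_ref  # output change per unit shift
    return x0_ref + scale * response


rng = np.random.default_rng(0)
dim = 8
A = rng.standard_normal((dim, dim))
b = rng.standard_normal(dim)


def toy_sampler(x):
    # Hypothetical stand-in for a DDIM sampler: an affine map.
    return A @ x + b


x_T = rng.standard_normal(dim)
direction = np.ones(dim) / np.sqrt(dim)

predicted = predict_shifted_output(toy_sampler, x_T, direction, scale=3.0)
actual = toy_sampler(x_T + 3.0 * direction)
max_error = float(np.max(np.abs(predicted - actual)))
print(max_error)
```

For the affine toy sampler the prediction matches up to floating-point error; for a real sampler the gap between `predicted` and `actual` would indicate how far the mapping departs from linearity.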

🧭 Linearity as a Window into Data Distribution

Beyond controllability, linearity reveals something fundamental about the model’s understanding of the data.

When we test the same sampling process on out-of-distribution (OOD) data, the linear correlation between (x_T) and (x_0) sharply deteriorates.

This suggests that:

High linearity corresponds to data within the training distribution.
Low linearity flags OOD or underrepresented regions.

Thus, linearity becomes not just a control signal—but a diagnostic tool for understanding the geometry of the learned data manifold.

🌌 Why CCS Matters

Training-free: CCS requires no additional model training or architectural changes.
Generalizable: Works across diverse diffusion backbones (e.g., Stable Diffusion, DiT, Video Diffusion).
Interpretable: Provides a physics-like view of how information flows through the generative process.
Diagnostic: Offers new metrics for measuring model robustness and data coverage.

For details check our paper at: https://arxiv.org/abs/2502.04670

Concept	Description
Observation	DDIM sampling exhibits near-linear mapping from initial noise to output.
Implication	Enables direct control over sample mean, MSE, and other statistics.
Insight	Linearity reflects model’s internal understanding of the data distribution—lower for OOD data.
Contribution	CCS provides a unified, training-free framework for controllable and constrained sampling.

A bug when installing new environment in Conda

2023-08-15T00:00:00+00:00

(base) atcold@AlfMAC3 ~ $ which python
/usr/local/bin/python  # system python
(base) atcold@AlfMAC3 ~ $ conda activate PPUU
(PPUU) atcold@AlfMAC3 ~ $ which python
/usr/local/bin/python  # still system python!

Workaround: run conda deactivate.

Launching new shell

(base) atcold@AlfMAC3 ~ $ conda deactivate
atcold@AlfMAC3 ~ $ conda activate PPUU
(PPUU) atcold@AlfMAC3 ~ $ which python
/Users/atcold/opt/miniconda3/envs/PPUU/bin/python  # virtual environment python

Seems like (base) needs to be deactivated first.
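A quick way to spot this mismatch from inside Python is to compare the running interpreter with the `CONDA_PREFIX` environment variable that `conda activate` sets. The helper name `interpreter_in_env` below is hypothetical, and this is only a diagnostic sketch, not a fix for the underlying PATH problem.

```python
import os
import sys
from pathlib import Path


def interpreter_in_env(executable, conda_prefix):
    """Return True if `executable` lives inside the `conda_prefix` directory."""
    if not conda_prefix:
        return False  # no conda environment is active
    return Path(conda_prefix) in Path(executable).parents


active_prefix = os.environ.get("CONDA_PREFIX")
if interpreter_in_env(sys.executable, active_prefix):
    print("OK: python comes from the active conda environment")
else:
    print(f"Mismatch: python is {sys.executable}, CONDA_PREFIX is {active_prefix}")
```

In the failing case above, the script would report a mismatch because the interpreter is /usr/local/bin/python while the prompt shows (PPUU).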

Road to Recovery - Notes from 1.5 Years After a Complete ACL Rupture and Complex Meniscus Tear

2023-07-24T00:00:00+00:00

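Skiing Accident (2021/12/12)

On December 12, 2021, I was skiing at Mammoth Mountain. The temperature was 10 to 15 degrees below zero, the wind before noon was extremely strong, the chairlift was swinging to 45 degrees, visibility was poor, and the runs were hard ice. I had planned to take on Dropout Chute and Wipeout Chute off Chair 23. That morning I warmed up with two runs down Gravy Chute, felt great, and decided to drop the jump turns and ski the chute directly.

At first everything went smoothly. The first few turns were fluid, and I even thought to myself: this double black is nothing special. Just as I was feeling pleased with myself, I made one turn too wide, lost my balance in an instant, and started tumbling head first. With a 42-degree pitch and gravity, I had no control over my speed and tumbled toward the bottom at high speed, rolling about 200 meters before miraculously coming to a stop. I checked and found I had lost only one ski, thought I had been lucky, and planned to keep skiing down. But on the first turn I found that my left knee could not generate any force, as if it had gone numb, and I fell again and rolled about another 100 meters. By then I could clearly feel that the leg would not straighten. I could barely walk a few steps, and the pain was obvious. Passersby saw this and called for rescue, and I was taken down the mountain on a sled. I kept almost all of my belongings; only my action camera stayed behind forever in some corner of the mountain.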

⸻

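Early Injury Period (2021/12 - 2022/1)

X-rays showed an avulsion fracture of the tibial plateau, and the doctor suspected an ACL rupture. A follow-up MRI confirmed the worst outcome: a complete ACL rupture, a complex meniscus tear, and a displaced bucket-handle fragment jamming the joint. This usually means saying goodbye to strenuous sports. At that point I could not straighten my leg, relied on crutches for everything, and found going out extremely inconvenient. I especially hated the bathtub design at home, which is very unfriendly to anyone with limited mobility.

In early January, when I saw the surgeon, my knee could bend only to 90 degrees and could not straighten. He recommended a two-stage surgery: first restore range of motion and repair the meniscus, then reconstruct the ACL a few months later. I decided on the spot to have surgery as soon as possible. Many people would seek opinions from several doctors, but at the time I trusted this doctor's judgment.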

⸻

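First Surgery: Meniscus Repair

Before surgery, the doctor estimated that the meniscus might need to be removed. Just before anesthesia, I asked one last question: "Will I be able to ski again?" The doctor answered: "Absolutely you can." When I woke up, the nurse told me: "Good news, the doctor managed to repair the meniscus!" This was good news with odds far below 50%. The surgical report showed that the meniscus took seven stitches and that the cartilage was well preserved.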

⸻

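Early Post-Surgery Recovery (2022/1 - 2022/3)

Meniscus repair has a long recovery period, with almost no weight bearing in the first month after surgery. I adopted an extremely conservative recovery strategy: long sessions of leg elevation with icing, plus muscle contraction exercises (quadriceps and so on). Early on, straight leg raises were off limits to avoid patellar injury. For the first three weeks the knee could not bend past 90 degrees, so I could only practice ankle movement and quadriceps contractions while gradually restoring flexion. I iced several times a day and had to wear the ACL brace whenever I got out of bed.

I scheduled two rehab sessions a week. The physical therapist was very professional and improved my knee flexion by 10 degrees in the first session. In the third week after surgery, I gradually started trying to stand with weight on both feet. By early February I could sit at a desk for fairly long stretches and walk with a single crutch. At the end of February, I was officially off crutches!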

⸻

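Mid-Stage Post-Surgery Recovery (2022/3 - 2022/5)

After coming off crutches, the focus shifted to strength training, especially the quadriceps. I started with chair-assisted half squats and gradually added straight leg raises, side leg raises, planks, and more. In rehab, I held a plank for 3 minutes and was told "that's enough," so I moved on to more efficient exercises such as dead bug and side plank. My knee still occasionally felt unstable when walking, mainly because the ACL was still torn, and the strength gap between my legs was huge: the left leg could handle only 40 pounds, the right about 200 pounds.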

⸻

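Second Surgery: ACL Reconstruction (2022/5)

A checkup at the end of April confirmed that I was ready for surgery, and the doctor recommended a BTB (patellar tendon) graft for the ACL reconstruction. The surgery went smoothly, and after waking up I even discussed research with a collaborator. The doctor called after surgery to confirm that the meniscus had healed well, which was the best news since the injury. Because of the staged surgical plan, I was able to return to weight bearing faster.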

⸻

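Early Recovery After ACL Surgery (2022/5 - 2022/7)

Swelling after surgery was significant and the pain was worse than after the first surgery, but it eased noticeably by the third day. I started rehab on day six, with the knee bending to 105 degrees. I kept training conservatively, focusing on strengthening the quadriceps and calf muscles, started riding a spin bike, and gradually increased the intensity of single-leg exercises. Six weeks after surgery, knee range of motion was back to 130 degrees.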

⸻

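Mid-Stage Recovery (2022/7 - 2022/10)

Getting back to a normal gait took a long time; it was not until the end of October that I walked like a normal person. The physical therapist pointed out that my left leg was stiff when walking and set up a staged practice plan (leg lift, landing with a slightly bent knee, push-off). We later found that tightness in my left glute was the root cause, and things improved markedly once I practiced relaxing it.

After moving to Seattle in September, I switched to a rehab center more focused on athletic performance, with many exercises and high intensity. By then I could already do activities like hiking. But knee flexion was still insufficient, and the physical therapist tried cupping therapy to release adhesions, with slight improvement.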

⸻

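Unexpected Incident (2022/10/18)

While walking downhill on a stroll, my knee made a "pop" and instantly became unusually loose, as if it had lost tension. I was terrified that the ACL had ruptured. The physical therapist examined it and found it stable, the doctor also said it was fine, and an MRI confirmed that the ligament was intact. Adhesive tissue had probably been released, which actually brought a positive change. With rest during recovery and adjustments to my gait, the clicking decreased markedly and range of motion improved.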

⸻

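Late Recovery and Return (2022/12 - Present)

After that, recovery was steady, and I gradually began jump training, first on two feet and then on one. In March 2023 I started run training and found that my left leg's load capacity was still lacking; after further strengthening, it improved markedly. I also added deadlifts and core training to strengthen the hamstrings and core.

Starting in June 2023, I began systematic run training, combining intervals with easy runs. Research I had read suggested that running faster actually puts less impact on the knees, so I stuck with high-speed, short-duration training. On July 23, I finished the 5K at the San Francisco Marathon, placing 21st overall and 4th in my age group, with no knee discomfort at all afterward. You could say I have "graduated" from the ACL injury. There are still shortcomings, but I will keep working and keep improving.

⸻ 2025/4/21: Ran 19:19 in a 5K race; now working toward sub-18.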

Domain Adaptation with Latent Diffusion Models for Segmentation and Classification

2023-07-15T00:00:00+00:00

In-Context Learning Unlocked for Diffusion Models: https://zhendong-wang.github.io/prompt-diffusion.github.io/, in-context visual/text prompts

Fast Adaptation with in-context learning to new inverse problems

Long Video Generation with Latent Diffusion Models via AutoPrompting

3D Latent Diffusion

NeRF type methods

3D Neural Field Generation using Triplane Diffusion (code available): Triplane Diffusion
Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction (code available): Diffusion-NeRF
One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization (code pending): Single Image

Video Diffusion

Latent Video Diffusion Models for High-Fidelity Long Video Generation Video Diffusion
MagicVideo: Efficient Video Generation With Latent Diffusion Models Video Diffusion 2
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models Latent Video Diffusion Latent CVPR (combining a SR DM and an LDM)
This can be used together with a 2D image latent diffusion model (e.g. Stable Diffusion). The key idea is to adjust the latent vector with a 3D temporal network.
Training
1. First, use the latent 2D diffusion encoder to obtain a code (spatial step).
2. Then use the temporal network to adjust the latent code, and combine it with the original code (a convex combination).
3. Repeat this a couple of times.
Inference
Video Probabilistic Diffusion Models in Projected Latent Space
1. Use the triplane idea (xy, xz, yz) latent codes (using 2D diffusion instead of 3D diffusion)
2. First use video transformer to compress video C X H X W -> C X H’ X W’
3. Then use three small transformers to project 3D into 2D, i.e. $z_h = f_{\theta}(u_h)$, $z_w = f_{\theta}(u_w)$, $z_c = f_{\theta}(u_c)$, reducing space complexity from $O(HWC)$ to $O(HW) + O(CW) + O(HC)$
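Two of the ideas in the notes above are easy to make concrete with a small sketch: the convex combination of the spatial and temporally adjusted latent codes, and the size reduction from a dense H x W x C latent to three 2D planes. The function names `mix_latents` and `latent_sizes`, the fixed `alpha`, and the example shapes are all assumptions for illustration, not code from either paper.

```python
import numpy as np


def mix_latents(z_spatial, z_temporal, alpha):
    """Convex combination of the spatial code and the temporally adjusted code."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * z_spatial + (1.0 - alpha) * z_temporal


def latent_sizes(h, w, c):
    """Element counts for a dense H*W*C latent versus three 2D planes."""
    dense = h * w * c
    triplane = h * w + c * w + h * c
    return dense, triplane


# Toy codes: zeros for the spatial code, ones for the temporal adjustment.
z_spatial = np.zeros((4, 8, 8))
z_temporal = np.ones((4, 8, 8))
mixed = mix_latents(z_spatial, z_temporal, alpha=0.25)

dense, triplane = latent_sizes(h=32, w=32, c=16)
print(mixed[0, 0, 0], dense, triplane)
```

With these example sizes, the dense latent has 16384 elements while the three planes together have 2048, which is the O(HWC) versus O(HW) + O(CW) + O(HC) gap in a concrete case.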

Long Video Generation

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation (Coarse to Fine model). Coarse diffusion with a fine diffusion: https://arxiv.org/pdf/2303.12346.pdf
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation, https://arxiv.org/pdf/2303.08320.pdf, generate residual due to highly correlated frames
Flexible Diffusion Modeling of Long Videos: https://arxiv.org/pdf/2205.11495.pdf, conditional generation
Video Diffusion Models https://arxiv.org/abs/2204.03458, 3D UNet
VIDM: Video Implicit Diffusion Models https://arxiv.org/pdf/2212.00235.pdf, combining a motion generator and a content generator, with normalization (INR-like)
Video Diffusion Models with Local-Global Context Guidance https://arxiv.org/pdf/2306.02562.pdf Global context and local context
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation https://arxiv.org/pdf/2307.06940.pdf

Visual AutoPrompting

ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation https://arxiv.org/pdf/2305.04651.pdf, use GPT3 to change prompt and edit image
Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models https://arxiv.org/pdf/2305.16807.pdf, image editing without optimization
Visual Instruction Inversion: Image Editing via Visual Prompting https://arxiv.org/pdf/2307.14331.pdf
Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models https://arxiv.org/abs/2209.06970

Test-time Adaptation

Prompting Diffusion Representations for Cross-Domain Semantic Segmentation: https://arxiv.org/pdf/2307.02138.pdf use prompt to improve generalization ability of diffusion models

Inverse Problem

Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision https://arxiv.org/pdf/2306.11719.pdf

Other Related Works

MAGVIT: Masked Generative Video Transformer: https://arxiv.org/pdf/2212.05199.pdf
MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation https://arxiv.org/pdf/2205.09853.pdf masked methods like MAE etc
Diffusion Models as Masked Autoencoders https://arxiv.org/abs/2304.03283
Diffusion Models Already Have a Semantic Latent Space https://arxiv.org/pdf/2210.10960.pdf
ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models https://arxiv.org/pdf/2302.07257.pdf
Visual Instruction Tuning: https://arxiv.org/pdf/2304.08485.pdf
Adversarial Discriminative Domain Adaptation https://arxiv.org/pdf/1702.05464.pdf
Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Imaging Inverse Problems https://arxiv.org/pdf/2308.14409.pdf

Ideas: sketch to image; VAE -> shared features with a degraded image (maybe try a latent diffusion embedding); LoRA; shared feature embedding/text; changing the model itself; ControlNet.

VPDM Architecture

Triplane Representation of Knee MRI image

Architecture of Diffusion NeRF

Results for Sparse-View Reconstruction
