A phrase circulated around the 2026 New Year: “2026 will be the summer of a century.” It resonated with me.

I grew up in Changsha and spent my undergraduate years in Beijing. On August 19, 2024, I wrapped up an internship in Beijing and moved to Hangzhou. In 2025, I lived in Hangzhou for a full year. Looking back, Hangzhou feels like an unlikely blend of Changsha’s temperament and Beijing’s pace of development. Even more coincidentally, its climate also seems like an average of the two. It is neither persistently humid like Changsha nor aggressively dry like Beijing. It does rain, but rarely in a way that makes your mood start to mildew.

For Changsha, summer means finally getting past the sticky plum rain season in spring. Clothes never fully dry, roads stay damp, and even emotions feel wrapped in moisture. For Beijing, summer is one of my favorite seasons. The daylight is long, the air is dry, and walking outside can feel like the world has been turned up in brightness. Beijing’s spring is not bad either, except for the occasional sandstorm. It reminds me of asking GPT to draft some code and getting a pile of try/except blocks inserted by default: logic that should be clean suddenly becomes bloated. The upside is that this kind of dust is usually easy to clean. The downside is that if you forget to bring laundry in from the balcony, the stains remain. It is similar to copying model-generated code into a repository without cleanup: it may run today, but it leaves grit everywhere for future maintenance. As for Hangzhou, aside from the lack of heating in winter, it has few climate disadvantages for me. It is not constantly rainy, and it is not dry.

The psychological impact of Changsha’s plum rain season resembles a state I experienced in research in 2025: uncertainty, confusion, and a persistent dampness. It is not catastrophic, but it is also hard to ignore. In that season, carrying an umbrella does not keep you fully dry, and not carrying one does not necessarily soak you either. That neither-big-nor-small feeling is precisely what drains you. You cannot reduce it to “this is major” or “this is trivial,” and you cannot fully pretend it is not there.

In 2025, one of my projects went through submissions from spring all the way to winter and still ended in rejection. It was not a failure of the obviously flawed kind. It was the plum rain kind of failure: you keep feeling you are just slightly short, yet you never quite get past it. The core idea was already formed by March. The implementation was, in principle, straightforward for me, but the code at the time was not written by me and I did not audit it carefully. Near the deadline, the results deviated from what I had in mind, and I submitted the paper anyway. In retrospect, it was essentially a gamble that reviewers would see the direction despite the gap in execution. The outcome was predictable. In July, I re-implemented the code and resubmitted; it was rejected again. In November, we went another round, and the result did not change.

Throughout that process, the question of whether to keep pushing or to stop became a concrete tension. Continuing meant investing more while already suspecting the direction might be wrong. Stopping meant accepting the loss when success still felt close. In retrospect, it left me with a simple reminder: do not submit work whose execution I have not personally verified, and do not bet on reviewers seeing past a gap I already know is there.

Of course, Changsha’s spring is not only rain. In the first half of 2025, I went to Nashville for CVPR. During the National Day holiday later that year, I traveled to Japan.

Figure 1. Trips to the US and Japan in 2025.


Back to research, my subjective impression is that the technical atmosphere in 2025 was less heated than in the previous two years. It felt more like Beijing’s spring: brief and restrained. Releases of new image generation models became noticeably cautious, with a modest uptick in the second half of the year, including the appearance of Z Image. Video generation, however, saw a clearer push in 2025. The open-sourcing of Wan was genuinely exciting, although running it in practice also made it clear that the space is still in an early stage.

For my own trajectory, the more dramatic change was elsewhere. With the emergence of more general-purpose generation and editing systems such as GPT Image and Nano Banana, the task boundary for the kind of applied personalized generation work I had focused on over the past two years was effectively redrawn.

This is not to say that line of work lacks value; it still has product value and application value. But I have become increasingly convinced that if a direction is highly data-driven and its advantages mainly come from data and scale, then in an academic setting it is easily eclipsed by stronger general-purpose models. The boom around SD 1.5, SDXL, and the open-source ecosystem in 2023 and 2024 created a meaningful buffer: we could produce strong work in the capability gap. But when that gap shrinks, or disappears, the contribution needs to shift from better results to clearer mechanisms, more controllable variables, and more interpretable cost and efficiency. This was the second major point I had to reflect on in 2025.

For that reason, starting in June 2025, I deliberately adjusted my research focus. I imposed stricter criteria for what is worth investing in. The first is evaluability: whether there is a widely accepted benchmark, or at least a stable and reproducible evaluation loop. The second is abstractability: whether the problem touches a more fundamental regularity, rather than relying on the temporary luck of a particular dataset or a specific model configuration.

I see this as part of training research taste: learning to judge what remains worth studying even as models get stronger. If my earlier weakness was being pulled too strongly by short-term results, without sufficiently dissecting and abstracting the underlying principles, then the shift in the second half of 2025 was my first serious attempt to make those standards explicit and to enforce them.

A related change was that I began to actively question the assumption that training-free is inherently more elegant. Improving performance by modifying the inference procedure is tempting: it requires no training, it is low cost, it iterates quickly, and it reads smoothly on paper. But unless a method is truly model-agnostic, it easily becomes entangled with model-specific bias. You need deep familiarity with a particular model to find structures you can exploit, yet those structures are often the model’s limitations as well. The result is that a method may appear to solve a problem while actually adapting to a specific model’s worldview. In Chinese there is a saying about treating the symptoms rather than the root cause. Increasingly, I would rather invest in regularities that do not depend on a model’s temporary advantages, principles that remain valid even as models turn over.
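To make the temptation concrete, here is a minimal, entirely hypothetical sketch of what such an inference-time tweak often looks like: pick a layer whose behavior you understand in one specific model, and scale its output during sampling. The toy module, the chosen layer index, and the scale factor are all illustrative placeholders, not taken from any real model or paper.

```python
# A hypothetical "training-free" inference tweak: amplify one attention
# block's output at sampling time. Everything here is a toy stand-in.
import torch
import torch.nn as nn

class ToyAttentionBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return out

model = nn.Sequential(ToyAttentionBlock(64), ToyAttentionBlock(64))

TARGET_LAYER = 1   # chosen by inspecting this particular model
SCALE = 1.3        # tuned against this particular model's behavior

def scale_output(module, inputs, output):
    # The "exploit": rescale one block's contribution at inference time.
    return output * SCALE

handle = model[TARGET_LAYER].register_forward_hook(scale_output)

with torch.no_grad():
    y = model(torch.randn(2, 16, 64))  # runs fine here, but the layer choice
                                       # and scale encode this model's quirks

handle.remove()
```

The sketch works only because the chosen layer and scale encode that one model’s quirks; swap the model and both choices have to be rediscovered, which is exactly the entanglement described above.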

Original ideas matter. But for most ordinary researchers, consistently producing very strong ideas is difficult under real constraints. Compute is limited, time is limited, and cognitive bandwidth is limited. Engineering, by contrast, is one of the few advantages that can be accumulated through deliberate practice. It determines whether the problem is defined precisely, whether the evaluation loop is trustworthy, whether variables are controlled, and whether experiments are reproducible, diagnosable, and iterative. Many moments that feel like I am out of ideas are, in practice, cases where the system is too noisy and too messy to tell what is actually being solved. In 2026, I want to commit more firmly to this path. I want to make the research clearer, make the engineering cleaner, and focus on what is fundamental and likely to remain valid over time.
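As a rough illustration of what cleaner engineering means to me in practice, here is a minimal sketch of experiment hygiene: fix the randomness, record the exact configuration, and change one variable per run. The names (RunConfig, run_experiment, the lr field) are placeholders, not code from any specific project of mine.

```python
# A minimal sketch of experiment hygiene: seed everything, persist the full
# config next to the run, and keep one variable per run.
import json
import random
import time
from dataclasses import dataclass, asdict
from pathlib import Path

import numpy as np
import torch

@dataclass
class RunConfig:
    seed: int = 0
    lr: float = 1e-4          # the single variable this run changes
    model_name: str = "baseline"
    notes: str = ""

def set_seed(seed: int) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

def run_experiment(cfg: RunConfig, out_dir: str = "runs") -> None:
    set_seed(cfg.seed)
    run_dir = Path(out_dir) / time.strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    # Persist the exact configuration so the run is diagnosable and repeatable.
    (run_dir / "config.json").write_text(json.dumps(asdict(cfg), indent=2))
    # ... training / evaluation would go here ...

run_experiment(RunConfig(seed=0, lr=3e-4, notes="lr sweep, step 1"))
```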

To make this concrete, I added a practical criterion for myself. After completing a paper, can I also produce a reusable handbook or guideline, one that traces the development of the related line of work and consolidates key settings, data processing, evaluation scripts, and major baselines into a single repository as much as possible? The point is not open-sourcing for its own sake. It is building a stable, controllable, and continuously extensible evaluation instrument. Others can reproduce it, and I can test the next change under the same benchmark to verify whether it is genuine progress or self-deception.
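As a minimal sketch of what such an evaluation instrument might look like in code: one entry point that runs every registered method against the same frozen benchmark and reports comparable numbers. All names here (load_benchmark, METHODS, metric) are placeholders for illustration, not an existing repository.

```python
# A toy evaluation harness: every method is scored on the same fixed
# benchmark, so a new result is always comparable to the old ones.
from typing import Callable, Dict, List

def load_benchmark() -> List[dict]:
    # In a real repo this would load a frozen, versioned test set.
    return [{"input": i, "target": i * 2} for i in range(100)]

def metric(prediction: int, target: int) -> float:
    return float(prediction == target)

def baseline(example: dict) -> int:
    return example["input"] * 2

def new_method(example: dict) -> int:
    return example["input"] * 2 if example["input"] % 2 == 0 else 0

METHODS: Dict[str, Callable[[dict], int]] = {
    "baseline": baseline,
    "new_method": new_method,
}

def evaluate() -> None:
    data = load_benchmark()
    for name, fn in METHODS.items():
        score = sum(metric(fn(ex), ex["target"]) for ex in data) / len(data)
        print(f"{name}: {score:.3f}")

if __name__ == "__main__":
    evaluate()
```

The value is not in the toy metric but in the shape: the benchmark is loaded once, every method sees the same data, and adding a new method means registering one function rather than rewriting the loop.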