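Inception Labs (official site: https://www.inceptionlabs.ai/) has announced Mercury, which it describes as the first commercial-scale diffusion large language model ("Introducing Mercury, the first commercial-scale diffusion large language model"). In a word:
Fast!
Very fast!
The official site introduces it as follows: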
Our Vision — Next Generation LLMs Powered By Diffusion
Current large language models are autoregressive, meaning that they generate text left to right, one token at a time. Generation is inherently sequential—a token cannot be generated until all the text that comes before it has been generated—and generating each token requires evaluating a neural network with billions of parameters. Frontier LLM companies are betting on test-time computation to increase reasoning and error-correction capabilities, but generating long reasoning traces comes at the price of ballooning inference costs and unusable latency. A paradigm shift is needed to make high-quality AI solutions truly accessible.
Diffusion models provide such a paradigm shift. These models operate with a “coarse-to-fine” generation process, where the output is refined from pure noise over a few “denoising” steps, as illustrated in the video above.
Because diffusion models are not restricted to only considering previous output, they are better at reasoning and at structuring their responses. And because diffusion models can continually refine their outputs, they can correct mistakes and hallucinations. For these reasons, diffusion powers all of the most prominent AI solutions for video, image, and audio generation, including Sora, Midjourney, and Riffusion. However, applications of diffusion to discrete data such as text and code have never been successful. Until now.
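To make the contrast in the quoted passage concrete, here is a toy sketch of the two decoding loops. It is not Mercury's algorithm, which the announcement does not describe. The names `toy_model`, `autoregressive_generate`, `diffusion_generate`, and the `MASK` sentinel are all hypothetical, and the "model" is just an oracle that already knows the answer. The only thing the sketch shows is how many sequential model calls each loop needs.

```python
MASK = "<mask>"  # hypothetical placeholder for a not-yet-generated position


def toy_model(current, target):
    """Stand-in for a neural network (assumption: a perfect oracle).

    One call proposes a token for every position at once.
    """
    return list(target)


def autoregressive_generate(target):
    """Left to right, one token per model call."""
    out, calls = [], 0
    while len(out) < len(target):
        proposal = toy_model(out, target)  # one network evaluation
        calls += 1
        out.append(proposal[len(out)])     # keep only the next token
    return out, calls


def diffusion_generate(target, steps):
    """Start fully masked; each call fills positions spread over the whole sequence."""
    n = len(target)
    out, calls = [MASK] * n, 0
    for step in range(steps):
        if MASK not in out:
            break
        proposal = toy_model(out, target)  # one network evaluation
        calls += 1
        for i in range(step, n, steps):    # every steps-th position, offset by step
            out[i] = proposal[i]
    return out, calls


if __name__ == "__main__":
    tokens = "the quick brown fox jumps over the lazy dog".split()
    print(autoregressive_generate(tokens)[1])  # calls grow with sequence length
    print(diffusion_generate(tokens, 3)[1])    # calls bounded by the step count
```

In this toy setup the autoregressive loop needs one call per token, while the diffusion-style loop needs at most `steps` calls regardless of length. A real diffusion model would also be able to revise tokens it has already filled in, which this sketch leaves out.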
AI is iterating and evolving quickly.