DeepSeek releases new open-source model V3.1, expanding context length to 128K
DeepSeek has announced the open-source release of its new base model, DeepSeek-V3.1-Base. Shortly after it was published on Hugging Face, the model climbed to fourth place on the platform's trending models list. DeepSeek-V3.1-Base uses a Mixture of Experts (MoE) architecture and expands the context length to 128K tokens, while keeping the same parameter count as V3.
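Since the weights are hosted on Hugging Face, the published configuration can be inspected directly to check details such as context length and expert count. The sketch below is a minimal example using the `transformers` library; the repo id "deepseek-ai/DeepSeek-V3.1-Base" and the specific config field names are assumptions based on the announcement and on common DeepSeek/`transformers` conventions, not confirmed by the article.

```python
# Minimal sketch: fetch the model config from Hugging Face and print a few
# architecture details. Repo id and field names are assumptions.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "deepseek-ai/DeepSeek-V3.1-Base",
    trust_remote_code=True,  # DeepSeek repos ship custom model code
)

# Typical fields on a DeepSeek-style MoE config; exact names may vary by release.
print("context length:", getattr(config, "max_position_embeddings", "n/a"))
print("routed experts:", getattr(config, "n_routed_experts", "n/a"))
print("hidden size:", getattr(config, "hidden_size", "n/a"))
```

Loading only the config keeps the check lightweight; downloading and running the full MoE checkpoint requires far more memory and is out of scope here.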