DeepSeek-V3.2 Series

By introducing DeepSeek Sparse Attention (DSA), a scalable reinforcement learning framework, and a large-scale agentic task synthesis pipeline, DeepSeek-V3.2 (DeepSeek-AI, 2025) achieves reasoning capabilities and agent performance comparable to GPT-5. Fig. 1. Benchmark of DeepSeek-V3.2 and its counterparts. (Image source: DeepSeek-AI, 2025) DeepSeek Sparse Attention Fig. 2. Attention architecture of DeepSeek-V3.2, where DSA is instantiated under MLA. (Image source: DeepSeek-AI, 2025) ...

Created: 2025-12-31 · Updated: 2025-12-31 · 14 min · 2917 words · Yue Shui

gpt-oss & GPT-5

In August 2025, the AI field witnessed a period of intensive releases from OpenAI. Following GPT-2 (OpenAI, 2019) in 2019, OpenAI has once again contributed to the open-source community with its first open-weight large language model series, gpt-oss (OpenAI, 2025), available in 120B and 20B sizes. Shortly after, the highly anticipated next-generation flagship model, GPT-5 (OpenAI, 2025), was also officially launched. This series of releases not only marks a new high for open-source models in reasoning and agent capabilities but also reveals OpenAI’s latest advancements in model architecture, training methodologies, and safety alignment. ...

Created: 2025-08-24 · Updated: 2025-08-24 · 12 min · 2541 words · Yue Shui