DeepSeek: DeepSeek V3.1 Base
This is a base model, trained only for raw next-token prediction. Unlike instruct/chat models, it has not been fine-tuned to follow user instructions, so prompts work best when framed as text for the model to continue (for example, a few-shot pattern or a partially written passage) rather than as bare requests like "Translate this"; see the sketch below.
DeepSeek-V3.1 Base is a 671B-parameter open Mixture-of-Experts (MoE) language model with 37B parameters active per forward pass and a context length of 128K tokens. Trained on 14.8T tokens using FP8 mixed precision, it achieves high training efficiency and stability, with strong performance across language, reasoning, math, and coding tasks.
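A minimal sketch of continuation-style prompting for a base model. The endpoint URL and model identifier below are placeholder assumptions, not the provider's actual API; the few-shot pattern stands in for the instruction a chat model would receive.

```python
import requests

# Few-shot pattern: a base model continues the text, so worked
# examples take the place of an explicit instruction.
prompt = (
    "English: The weather is nice today.\n"
    "French: Il fait beau aujourd'hui.\n"
    "English: Where is the train station?\n"
    "French:"
)

response = requests.post(
    "https://api.example.com/v1/completions",  # hypothetical endpoint
    json={
        "model": "deepseek-v3.1-base",  # hypothetical model identifier
        "prompt": prompt,
        "max_tokens": 32,
        "stop": ["\n"],  # stop once the completed line ends
    },
)
print(response.json()["choices"][0]["text"])
```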
Parameters
671B total (37B active per forward pass)
Context Window
163,840 tokens
Input Price
$0.25 per 1M tokens
Output Price
$1.00 per 1M tokens
Capabilities
Model capabilities and supported modalities
Performance
Excellent reasoning capabilities with strong logical analysis
Strong mathematical capabilities, handles complex calculations well
Modalities
Input: text
Output: text
LLM Price Calculator
Calculate the cost of using this model
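Cost scales linearly with token counts at the listed rates ($0.25 per 1M input tokens, $1.00 per 1M output tokens). A minimal sketch of the calculation; the monthly usage levels are chosen purely for illustration:

```python
# Minimal cost estimator using the listed per-1M-token rates.
INPUT_RATE = 0.25   # USD per 1M input tokens
OUTPUT_RATE = 1.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost for the given token usage."""
    return (input_tokens / 1_000_000) * INPUT_RATE + \
           (output_tokens / 1_000_000) * OUTPUT_RATE

# Hypothetical monthly usage levels, for illustration only.
for label, inp, out in [
    ("light", 1_000_000, 250_000),
    ("moderate", 10_000_000, 2_500_000),
    ("heavy", 100_000_000, 25_000_000),
]:
    print(f"{label}: ${estimate_cost(inp, out):.2f}/month")
```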