DeepSeek-R1 7B Requirements


DeepSeek-R1: Requirements and Deployment Guide

DeepSeek-R1 is a state-of-the-art reasoning model that has set new benchmarks in complex problem-solving, particularly in mathematics, science, and coding. Its performance is comparable to OpenAI's o1 model, and it is released under the MIT license, permitting open-source collaboration and commercial use.

Model Variants and Hardware Requirements

DeepSeek-R1 is distributed in several versions: the full-scale models and distilled variants optimized for different hardware capabilities.

Full-Scale Models:

  • DeepSeek-R1 and DeepSeek-R1-Zero:
    • Parameters: 671 billion
    • VRAM Requirement: Approximately 1,342 GB
    • Recommended Setup: Multi-GPU configuration, such as 16 NVIDIA A100 GPUs with 80GB each
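
The ~1,342 GB figure follows directly from the parameter count: FP16 weights take 2 bytes per parameter, so 671 billion parameters need about 1,342 GB before any activation or KV-cache overhead. A minimal sketch of that arithmetic in Python (the function name is illustrative, not from any published spec):

     # Rough VRAM needed just to hold the model weights at a given precision.
     def weight_vram_gb(params_billions: float, bytes_per_param: float) -> float:
         return params_billions * bytes_per_param

     print(weight_vram_gb(671, 2.0))   # FP16 -> 1342.0 GB, the figure above
     print(weight_vram_gb(671, 0.5))   # ~4-bit quantization -> ~335.5 GB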

Distilled Models:

These versions are optimized to retain significant reasoning capabilities while reducing hardware demands. Note that the VRAM figures below are far lower than the roughly 2 GB per billion parameters that FP16 weights would require; they correspond to approximately 4-bit quantized weights, the format tools like Ollama serve by default.

Model                         | Parameters (B) | VRAM Requirement (GB) | Recommended GPU
DeepSeek-R1-Distill-Qwen-1.5B | 1.5            | ~0.7                  | NVIDIA RTX 3060 12GB or higher
DeepSeek-R1-Distill-Qwen-7B   | 7              | ~3.3                  | NVIDIA RTX 3070 8GB or higher
DeepSeek-R1-Distill-Llama-8B  | 8              | ~3.7                  | NVIDIA RTX 3070 8GB or higher
DeepSeek-R1-Distill-Qwen-14B  | 14             | ~6.5                  | NVIDIA RTX 3080 10GB or higher
DeepSeek-R1-Distill-Qwen-32B  | 32             | ~14.9                 | NVIDIA RTX 4090 24GB
DeepSeek-R1-Distill-Llama-70B | 70             | ~32.7                 | NVIDIA RTX 4090 24GB (x2)
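
To map this table onto your own hardware, a small helper can pick the largest distilled model that fits in a given amount of VRAM. A minimal sketch, with the VRAM numbers copied from the table and the model tags matching Ollama's naming used later in this guide; the 1.3x headroom factor for activations and context is an illustrative assumption, not a published figure:

     # Approximate VRAM (GB) per distilled model, from the table above,
     # keyed by the Ollama tag for each model.
     MODELS = {
         "deepseek-r1:1.5b": 0.7,
         "deepseek-r1:7b": 3.3,
         "deepseek-r1:8b": 3.7,
         "deepseek-r1:14b": 6.5,
         "deepseek-r1:32b": 14.9,
         "deepseek-r1:70b": 32.7,
     }

     def largest_fit(vram_gb: float, headroom: float = 1.3) -> str | None:
         """Largest model whose weights, plus headroom for context, fit in VRAM."""
         fitting = [(need, name) for name, need in MODELS.items()
                    if need * headroom <= vram_gb]
         return max(fitting)[1] if fitting else None

     print(largest_fit(24.0))   # RTX 4090 -> deepseek-r1:32b
     print(largest_fit(8.0))    # RTX 3070 -> deepseek-r1:8b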

Running DeepSeek-R1 Locally

For users without access to high-end multi-GPU setups, the distilled models offer a practical alternative. These models can be run on consumer-grade hardware with varying VRAM capacities.

Using Ollama:

Ollama is a tool that facilitates running open-source AI models locally.

  1. Installation:

    • Download and install Ollama from the official website.

  2. Model Deployment:

    • Open a terminal (or command prompt) and run the following command to download and start the 8B distilled model:
     ollama run deepseek-r1:8b
    
    • For other model sizes, replace 8b with the desired parameter size (e.g., 1.5b, 14b).

  3. API Interaction:

    • Start the Ollama server:
     ollama serve
    
    • Send requests using curl:

     curl -X POST http://localhost:11434/api/generate -d '{
       "model": "deepseek-r1:8b",
       "prompt": "Your question or prompt here"
     }'
    
    • Replace "Your question or prompt here" with your actual prompt, and set the "model" field to the tag you pulled (here deepseek-r1:8b). By default the endpoint streams the reply as one JSON object per line; a Python version of the call is sketched after these steps.
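
For programmatic use, the same endpoint can be called from any HTTP client. A minimal Python sketch using only the standard library (the endpoint and fields mirror the curl example above; "stream": false is Ollama's documented option for returning a single JSON object instead of a line-by-line stream):

     import json
     import urllib.request

     def generate(prompt: str, model: str = "deepseek-r1:8b") -> str:
         # Same endpoint as the curl example; stream=False returns one JSON object.
         payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
         req = urllib.request.Request(
             "http://localhost:11434/api/generate",
             data=payload.encode("utf-8"),
             headers={"Content-Type": "application/json"},
         )
         with urllib.request.urlopen(req) as resp:
             return json.load(resp)["response"]

     print(generate("Why is the sky blue?"))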

Conclusion

DeepSeek-R1 offers a range of models to accommodate various hardware configurations. While the full-scale models require substantial computational resources, the distilled versions provide accessible alternatives for users with limited hardware capabilities. Tools like Ollama further simplify the process of running these models locally, enabling a broader audience to leverage advanced reasoning capabilities.