Scalability
Grow from a single VPS to Multi-GPU Dedicated Servers (e.g., Dual RTX 5090) as your needs evolve. Perfect for accelerating distributed AI training and massive parallel rendering tasks.
Advanced GPU VPS - RTX 5090
Enterprise GPU Dedicated Server - RTX 5090
Multi-GPU Dedicated Server- 2xRTX 5090
| Models | gemma3 | gemma3 | llama3.1 | deepseek-r1 | deepseek-r1 | qwen2.5 | qwen2.5 | qwq |
|---|---|---|---|---|---|---|---|---|
| Parameters | 12b | 27b | 8b | 14b | 32b | 14b | 32b | 32b |
| Size (GB) | 8.1 | 17 | 4.9 | 9.0 | 20 | 9.0 | 20 | 20 |
| GPU Memory | 32.8% | 82% | 82% | 66.3% | 95% | 66.5% | 95% | 94% |
| GPU UTL | 53% | 66% | 15% | 65% | 75% | 68% | 80% | 88% |
| Eval Rate(tokens/s) | 70.37 | 47.33 | 149.95 | 89.13 | 45.51 | 89.93 | 45.07 | 57.17 |
Note: The models are all from the Ollama library. For more testing data, please visit: https://www.databasemart.com/blog/ollama-gpu-benchmark-rtx5090