GPU Product Line and Selection Strategy
GPU Mart's GPU portfolio is designed and deployed based on application-specific requirements, architectural generations, long-term maintainability, and compute efficiency. We prioritize modern GPU architectures and mainstream models, and continuously update our offerings in line with NVIDIA's technology roadmap. GPU types are selected according to distinct AI and compute workloads to ensure consistent performance and operational stability.
All GPU resources are centrally procured, deployed, and managed by the Database Mart infrastructure team, and are governed under a unified lifecycle management framework, forming a standardized GPU infrastructure pool.
To meet the diverse needs of AI and compute workloads, we categorize GPU resources into the following four classes:
| Tier | Primary Use Cases | Key Characteristics | Representative Models |
|---|---|---|---|
| Entry-Level GPU | Development, testing, lightweight AI inference, remote graphics | Low power consumption, high stability, suitable for entry-level compute workloads | GTX 1650, GTX 1660 |
| Mainstream Performance GPU | Model inference, computer vision, rendering, general-purpose computing | Strong performance-to-cost ratio, mature software ecosystem, broad AI framework compatibility | RTX 2060 / 3060 Ti / 4060 / 5060 series |
| Workstation-Class GPU | Professional computing, enterprise AI development, stability-critical workloads | Professional-grade drivers, larger memory capacity, extended support lifecycles | RTX A-Series, RTX Pro Series |
| Data Center GPU | Large-scale model training, high-performance computing (HPC) | Optimized for training and HPC workloads, deployed exclusively in data center environments | A100, H100 |
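The tier structure above can be sketched as a simple selection helper. The tier names, workload keywords, and function below are illustrative examples, not an actual GPU Mart API:

```python
# Illustrative encoding of the tier table above; keys and workload
# keywords are examples, not an actual GPU Mart API.
GPU_TIERS = {
    "entry": {
        "use_cases": {"development", "testing", "light-inference", "remote-graphics"},
        "models": ["GTX 1650", "GTX 1660"],
    },
    "mainstream": {
        "use_cases": {"inference", "computer-vision", "rendering", "general-compute"},
        "models": ["RTX 2060", "RTX 3060 Ti", "RTX 4060", "RTX 5060"],
    },
    "workstation": {
        "use_cases": {"professional-compute", "enterprise-ai-dev"},
        "models": ["RTX A-Series", "RTX Pro Series"],
    },
    "datacenter": {
        "use_cases": {"large-model-training", "hpc"},
        "models": ["A100", "H100"],
    },
}

def suggest_tier(workload: str):
    """Return the first tier whose use cases include the given workload."""
    for tier, spec in GPU_TIERS.items():
        if workload in spec["use_cases"]:
            return tier
    return None
```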
GPU Allocation Model
GPU Mart assigns GPU resources using PCIe GPU passthrough, providing each GPU VPS and GPU Dedicated Server instance with a full physical GPU. Resources are provisioned with a one-to-one hardware mapping, with no partitioning, time-sharing, or oversubscription. This model ensures predictable performance and full compatibility with CUDA and AI workloads.
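The one-to-one mapping described above can be illustrated with a minimal allocation sketch: each physical GPU is bound to at most one instance, and an exhausted pool fails rather than oversubscribes. Class and method names are hypothetical, not GPU Mart's internal tooling:

```python
# Minimal sketch of a one-to-one GPU allocation model: each physical GPU
# is passed through to at most one instance, with no time-sharing or
# oversubscription. Names are illustrative.
class PassthroughPool:
    def __init__(self, gpu_ids):
        self.free = set(gpu_ids)   # unassigned physical GPUs
        self.assigned = {}         # gpu_id -> instance_id

    def allocate(self, instance_id):
        """Bind one whole physical GPU to an instance, or fail."""
        if not self.free:
            raise RuntimeError("no free GPUs: pool is never oversubscribed")
        gpu = self.free.pop()
        self.assigned[gpu] = instance_id
        return gpu

    def release(self, gpu):
        """Return a GPU to the pool; yields the instance it was bound to."""
        instance = self.assigned.pop(gpu)
        self.free.add(gpu)
        return instance
```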
GPU Virtualization & Performance Consistency
GPU VPS services are built on KVM virtualization combined with hardware-level GPU passthrough. This architecture delivers near-native GPU performance while maintaining virtualization flexibility and workload isolation, enabling customers to deploy AI and compute workloads with consistent and stable performance.
Multi-GPU & High-Speed Interconnect Support
For distributed training and parallel compute workloads, GPU Mart supports multi-GPU dedicated server deployments with optional high-speed GPU interconnect technologies such as NVLink. These configurations are designed for large model training, multi-GPU inference, and high-performance computing workloads.
Platform & Software Support
GPU Mart provides a standardized compute platform designed for AI, high-performance computing, and GPU-accelerated workloads. The platform is built around virtualization efficiency, hardware-level GPU allocation, and optimized runtime environments.
Preconfigured AI Runtime Environment
To reduce environment deployment complexity, GPU Mart provides optional preconfigured runtime stacks designed for AI and content generation workloads.
Common supported environments include:
- Pre-installed NVIDIA GPU drivers
- CUDA and GPU runtime installation available on request
- Ollama for local LLM inference
- ComfyUI for Stable Diffusion workflows
- Whisper speech recognition environments
These environments are provided as deployment templates and can be customized or replaced by customer-managed configurations.
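A quick way to audit which parts of such a template landed on an instance is to probe the PATH. The command names below are illustrative; actual binary names and install locations depend on the chosen deployment template:

```python
import shutil

# Sketch: check which runtime tools are reachable on the PATH of a
# provisioned instance. Tool names are illustrative examples.
EXPECTED_TOOLS = ["nvidia-smi", "nvcc", "ollama", "whisper"]

def audit_runtime(tools=EXPECTED_TOOLS):
    """Return a {tool: found} map by probing the PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}
```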
Container & Orchestration Support
GPU Mart supports containerized GPU workloads and cluster-based compute environments.
Available support includes:
- Kubernetes GPU cluster deployment assistance
- GPU scheduling and orchestration integration
- Container runtime compatibility with modern AI frameworks
Cluster environments can be deployed across both VPS and dedicated GPU infrastructure depending on workload scale and architecture requirements.
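On Kubernetes, GPU scheduling typically works by requesting the `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin. A minimal pod-spec sketch, with placeholder pod and image names:

```python
# Sketch of a Kubernetes pod spec requesting whole GPUs via the standard
# NVIDIA device plugin resource name (nvidia.com/gpu). Pod name and
# container image are placeholders.
def gpu_pod_spec(name, image, gpus=1):
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # GPUs are requested under "limits"; the scheduler places
                # the pod on a node with enough free GPUs.
                "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
            }],
        },
    }
```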
Storage Architecture
NVMe SSD Workload Storage
GPU Mart deploys NVMe SSD storage as high-performance workload storage across GPU VPS and most dedicated GPU server deployments. NVMe storage is primarily used for data processing, dataset storage, caching layers, and AI workload pipelines.
Advantages include:
- High IOPS performance for dataset loading and caching
- Low-latency data access for AI training and inference workflows
- Improved throughput for parallel compute pipelines
- Reduced GPU idle time caused by storage bottlenecks
NVMe storage is adopted as a baseline infrastructure standard to support modern AI and GPU-accelerated workloads.
RAID Storage Configuration (Dedicated Servers)
For dedicated GPU servers, RAID storage configurations may be deployed depending on workload and customer requirements.
Supported objectives include:
- Data redundancy and fault tolerance
- Increased aggregate storage throughput
- Improved data reliability for production workloads
RAID design is selected based on performance, redundancy, and capacity balancing requirements.
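The capacity side of that balance can be estimated back-of-the-envelope for the common RAID levels. This is a planning sketch for n identical disks, not a deployment tool:

```python
# Usable capacity for n identical disks at common RAID levels.
# A rough planning aid, not a deployment tool.
def raid_usable_tb(level: str, disks: int, size_tb: float) -> float:
    if level == "raid0":        # striping: full capacity, no redundancy
        return disks * size_tb
    if level == "raid1":        # mirroring (2 disks): half the raw capacity
        assert disks == 2
        return size_tb
    if level == "raid5":        # one disk's worth of parity
        assert disks >= 3
        return (disks - 1) * size_tb
    if level == "raid10":       # striped mirrors: half the raw capacity
        assert disks >= 4 and disks % 2 == 0
        return disks * size_tb / 2
    raise ValueError(f"unsupported level: {level}")
```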
Compute & Network Performance Optimization
To ensure stable high-throughput performance, the platform adopts optimized virtualization drivers and network architecture:
- Virtio high-performance drivers for storage and networking
- Low-latency data path optimization
- Upgradeable network bandwidth up to 1 Gbps (availability depends on data center location)
- Dual-stack public networking support (IPv4 and IPv6)
These optimizations are designed to support latency-sensitive AI inference, distributed training coordination, and data-intensive compute workloads.
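For sizing against the bandwidth options above, a rough transfer-time estimate is often enough. The sketch below assumes ideal line-rate throughput with no protocol overhead:

```python
# Rough transfer-time estimate for moving a dataset over a link of a
# given speed. Assumes ideal throughput with no protocol overhead.
def transfer_seconds(size_gb: float, link_gbps: float = 1.0) -> float:
    bits = size_gb * 8 * 1e9            # decimal GB -> bits
    return bits / (link_gbps * 1e9)     # link speed in bits per second
```

At the 1 Gbps ceiling noted above, a 100 GB dataset takes roughly 13-14 minutes under ideal conditions.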
Operations & Compliance
GPU Mart maintains standardized operational controls. All GPU assets are centrally procured and lifecycle-managed through the Database Mart (DBM) infrastructure system.
GPU resources are deployed under dedicated allocation policies with no oversubscription, supported by continuous infrastructure monitoring, tenant isolation controls, and baseline security and backup mechanisms.
Security & Infrastructure Stability
Platform stability and tenant isolation are maintained through multiple infrastructure protection mechanisms:
- Network-level tenant isolation within data center environments
- Baseline network threat mitigation and anti-mining protection
- Automated full-disk backups, with retention cycles scheduled according to the selected plan size (typically 2-week or 4-week intervals)