What's Llama 3?
Llama 3 is a family of large language models developed by Meta AI, a research laboratory that focuses on natural language processing (NLP) and other AI-related areas.
What makes Llama 3 special is its ability to understand and respond to a wide range of topics and questions, often with a high degree of accuracy and coherence. It was trained on a massive dataset of text from the internet and can adapt to different contexts and styles.
Llama 3 has many potential applications, such as chatbots, virtual assistants, language translation, and content generation. It's an exciting development in the field of AI, and I'm happy to chat with you more about it!
Key features of Llama 3
Conversational dialogue: Llama 3 can engage in natural-sounding conversations, using context and understanding to respond to questions and statements.
Knowledge retrieval: It can access a vast knowledge base to provide accurate information on a wide range of topics.
Common sense: Llama 3 has been designed to understand common sense and real-world concepts, making its responses more relatable and human-like.
Fine-tuned and optimized: Llama 3 instruction-tuned models are fine-tuned and optimized for dialogue/chat use cases and outperform many of the available open-source chat models on common benchmarks.
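Conversational context in practice: Ollama exposes a /api/chat endpoint that accepts the full message history on every request, which is how multi-turn context is carried between turns. A minimal Python sketch of building such a request body (the helper name is my own; a local Ollama server on the default port 11434 is assumed for actually sending it):

```python
import json

def build_chat_payload(model, history, stream=False):
    """Build the JSON body for Ollama's /api/chat endpoint.

    `history` is a list of {"role": ..., "content": ...} dicts; resending
    the whole list each turn is what gives the model its conversational
    context.
    """
    return {"model": model, "messages": history, "stream": stream}

# A two-turn conversation: the final user message only makes sense
# because the earlier turns are sent along with it.
history = [
    {"role": "user", "content": "Why is the sky blue?"},
    {"role": "assistant", "content": "Because of Rayleigh scattering."},
    {"role": "user", "content": "Explain that like I'm five."},
]

payload = build_chat_payload("llama3", history)
print(json.dumps(payload, indent=2))
```

To actually send it, POST this payload to http://localhost:11434/api/chat with any HTTP client.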
The most capable model
Llama 3 represents a large improvement over Llama 2 and other openly available models:
Trained on a dataset seven times larger than Llama 2's
Context length of 8K tokens, double that of Llama 2
Encodes language much more efficiently, using a larger token vocabulary of 128K tokens
Fewer than 1/3 the false “refusals” compared to Llama 2
How to run Llama 3 with Ollama
CLI
Open the terminal and run:

ollama run llama3

The initial release of Llama 3 includes two sizes: 8B and 70B parameters:

# 8B parameters
ollama run llama3:8b

# 70B parameters
ollama run llama3:70b
API
Example using curl:
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
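By default, /api/generate streams its answer as newline-delimited JSON objects, each carrying a "response" fragment, with "done": true on the last one. A small Python sketch of assembling the full answer from such a stream (the sample lines below are illustrative stand-ins, not real server output):

```python
import json

def collect_response(ndjson_lines):
    """Concatenate the "response" fragments from an Ollama
    /api/generate stream, stopping at the object marked done."""
    parts = []
    for line in ndjson_lines:
        obj = json.loads(line)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

# Illustrative fragments, shaped like Ollama's streaming output.
sample_stream = [
    '{"model": "llama3", "response": "The sky is blue ", "done": false}',
    '{"model": "llama3", "response": "because of Rayleigh scattering.", "done": true}',
]

print(collect_response(sample_stream))
```

In real use, you would iterate over the lines of the HTTP response body instead of a hard-coded list.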
Model variants
Instruct is fine-tuned for chat/dialogue use cases. Example:
ollama run llama3
ollama run llama3:70b
Pre-trained is the base model. Example:
ollama run llama3:text
ollama run llama3:70b-text
Express GPU VPS - K620
- 12GB RAM
- Dedicated GPU: Quadro K620
- 9 CPU Cores
- 160GB SSD
- 100Mbps Unmetered Bandwidth
- OS: Linux / Windows 10/ Windows 11
- Once per 4 Weeks Backup
- Single GPU Specifications:
- CUDA Cores: 384
- GPU Memory: 2GB DDR3
- FP32 Performance: 0.863 TFLOPS
Lite GPU Dedicated Server - K620
- 16GB RAM
- GPU: Nvidia Quadro K620
- Quad-Core Xeon E3-1270v3
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Maxwell
- CUDA Cores: 384
- GPU Memory: 2GB DDR3
- FP32 Performance: 0.863 TFLOPS
- Ideal for lightweight Android emulators, small LLMs, graphic processing, and more. More powerful than the GPU VPS.
Promotion Rules:
1) Limit one server per customer
2) One-time discount only (non-recurring)
Express GPU Dedicated Server - P620
- 32GB RAM
- GPU: Nvidia Quadro P620
- Eight-Core Xeon E5-2670
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Pascal
- CUDA Cores: 512
- GPU Memory: 2GB GDDR5
- FP32 Performance: 1.5 TFLOPS
Professional GPU VPS - A4000
- 32GB RAM
- 24 CPU Cores
- 320GB SSD
- 300Mbps Unmetered Bandwidth
- Once per 2 Weeks Backup
- OS: Linux / Windows 10/ Windows 11
- Dedicated GPU: Quadro RTX A4000
- CUDA Cores: 6,144
- Tensor Cores: 192
- GPU Memory: 16GB GDDR6
- FP32 Performance: 19.2 TFLOPS
Advanced GPU Dedicated Server - A5000
- 128GB RAM
- GPU: Nvidia Quadro RTX A5000
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 8192
- Tensor Cores: 256
- GPU Memory: 24GB GDDR6
- FP32 Performance: 27.8 TFLOPS
Enterprise GPU Dedicated Server - RTX A6000
- 256GB RAM
- GPU: Nvidia Quadro RTX A6000
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- GPU Memory: 48GB GDDR6
- FP32 Performance: 38.71 TFLOPS
Multi-GPU Dedicated Server - 3xV100
- 256GB RAM
- GPU: 3 x Nvidia V100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Volta
- CUDA Cores: 5,120
- Tensor Cores: 640
- GPU Memory: 16GB HBM2
- FP32 Performance: 14 TFLOPS
Enterprise GPU Dedicated Server - A100
- 256GB RAM
- GPU: Nvidia A100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6912
- Tensor Cores: 432
- GPU Memory: 40GB HBM2
- FP32 Performance: 19.5 TFLOPS
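To gauge which of the servers above can hold a given Llama 3 model, a rough rule of thumb is parameter count times bytes per parameter, plus some headroom for the KV cache and runtime. A back-of-the-envelope sketch (the ~20% overhead factor is my own assumption; real usage varies with quantization, context length, and runtime):

```python
def estimate_vram_gb(n_params_billion, bytes_per_param, overhead=1.2):
    """Very rough VRAM estimate: weight size inflated ~20% for
    KV cache and runtime overhead (assumed factor)."""
    return n_params_billion * bytes_per_param * overhead

# Ollama's default llama3 tags are 4-bit quantized (~0.5 bytes/param).
for size in (8, 70):
    print(f"llama3:{size}b ~ {estimate_vram_gb(size, 0.5):.1f} GB VRAM")
```

By this rough estimate, the quantized 8B model fits comfortably within the 16GB of the A4000 plan, while the 70B model wants something like the 48GB RTX A6000 server (or heavier quantization and partial CPU offload).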