Welcome to MoE Inference’s documentation!
Benchmarking
Evaluation plan
Orca approach
Proposed approach
Existing frameworks
Fairseq MoE
DeepSpeed-MoE
FasterTransformer MoE
MoE architectures
GShard
Fairseq’s 15B-parameter LM MoE model
Design and Implementation
Legion Overview
Tasks
Regions
Partitioning
Control Replication
Coherence
Mapping
Design
Implementation
Indices and tables
Index
Module Index
Search Page