Accelerate Attention with FlashAttention-3: New Capabilities and Performance
FlashAttention-3 significantly accelerates attention in AI models, reaching up to 1.2 PFLOPS with FP8 on Hopper GPUs and improving overall GPU utilization.