🚀 Assignment Overview
This assignment has two main components:
- Attention.py - Implement attention mechanisms from basic to multi-head (50 points)
- gpt2_model.py - Build GPT-2 components with modern optimizations (50 points)
Complete all TODO sections in the starter code. The implementations should follow the hints and maintain the original function signatures.