
Cross-Attention vs. Self-Attention Heads

How multi-head attention is used differently in encoder-decoder architectures for source-target alignment.
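The difference comes down to where the queries, keys, and values originate. In self-attention all three come from the same sequence; in cross-attention the queries come from the decoder while the keys and values come from the encoder output, so each attention row is a distribution over source positions. The following is a minimal sketch of that distinction, not a full Transformer layer: it assumes toy hand-written vectors, omits the learned projection matrices and the causal mask normally applied in decoder self-attention, and the helper names (`attention`, `split_heads`, `multi_head`) are illustrative only.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head; returns (outputs, weights)."""
    d_k = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return outputs, weights

def split_heads(seq, num_heads):
    """Slice each token vector into equal chunks (stand-in for learned projections)."""
    d_head = len(seq[0]) // num_heads
    return [[tok[h * d_head:(h + 1) * d_head] for tok in seq] for h in range(num_heads)]

def multi_head(query_seq, memory_seq, num_heads):
    """Queries come from query_seq; keys and values both come from memory_seq."""
    q_heads = split_heads(query_seq, num_heads)
    kv_heads = split_heads(memory_seq, num_heads)
    per_head = [attention(q, kv, kv) for q, kv in zip(q_heads, kv_heads)]
    # Concatenate the head outputs for each query position.
    outputs = [
        [x for head_out, _ in per_head for x in head_out[t]]
        for t in range(len(query_seq))
    ]
    weights = [w for _, w in per_head]
    return outputs, weights

# Toy data (assumed): 3 source tokens, 2 target tokens, model width 4.
encoder_states = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9], [0.3, 0.3, 0.8, 0.0]]
decoder_states = [[0.9, 0.1, 0.0, 1.0], [0.2, 0.8, 0.7, 0.1]]

# Self-attention: the decoder sequence attends to itself.
self_out, self_w = multi_head(decoder_states, decoder_states, num_heads=2)

# Cross-attention: decoder queries attend over encoder keys and values.
cross_out, cross_w = multi_head(decoder_states, encoder_states, num_heads=2)
```

Here `self_w` holds one target-by-target weight matrix per head, while `cross_w` holds one target-by-source matrix per head. The cross-attention rows are what get read as a soft source-target alignment.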
