Knowledge Distillation

Question 1 of 4

120s | intermediate (4/10) | conceptual

Knowledge distillation (Hinton et al. 2015) trains a small 'student' model to match a large 'teacher' model's outputs. Why use soft teacher probabilities instead of just hard labels?
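
To make the two kinds of target in the question concrete, here is a minimal sketch in plain Python that contrasts a one-hot hard label with temperature-softened teacher probabilities and combines both into a distillation-style loss. The logits, the class names, the temperature T and the mixing weight alpha are made-up values for illustration only; they do not come from the quiz or from any particular implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted):
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

# Hypothetical logits for three classes (cat, dog, truck); values are assumptions.
teacher_logits = [5.0, 3.0, -2.0]
student_logits = [2.0, 1.5, 0.5]
T = 2.0      # assumed distillation temperature
alpha = 0.5  # assumed weight between the hard-label and soft-target terms

hard_label = [1.0, 0.0, 0.0]               # says only "cat"
soft_targets = softmax(teacher_logits, T)  # also encodes that "dog" is closer to "cat" than "truck"

# Hard-label term uses the student's ordinary (T = 1) probabilities.
hard_loss = cross_entropy(hard_label, softmax(student_logits))
# Soft-target term compares teacher and student at the same temperature.
soft_loss = cross_entropy(soft_targets, softmax(student_logits, T))

# The T**2 factor follows the scaling suggested by Hinton et al. so that the
# soft-target term keeps a comparable gradient magnitude as T changes.
loss = alpha * hard_loss + (1 - alpha) * T ** 2 * soft_loss

print("soft targets:", [round(p, 3) for p in soft_targets])
print("combined loss:", round(loss, 3))
```

The hard label puts all of its mass on one class, while the soft targets keep a non-zero probability for each of the other classes, and the relative sizes of those probabilities differ from class to class.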