Discussion about this post

User's avatar
Alice E's avatar

Oh I see - they just don't want to "Scale reduces bias faster than variance. Larger models learn the correct objective more quickly than they learn to reliably pursue it. The gap between "knowing what to do" and "consistently doing it" grows with scale."

Also me!

Alice E's avatar

Very good advice on checks 👌

10 more comments...

No posts

Ready for more?