A curated collection of thoughts and builds centered on Apple Silicon.
Why a unified memory architecture is one of the few practical ways to run 70B-parameter models without a data-center budget.
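
For a rough sense of scale behind the 70B figure, the sketch below estimates the memory needed for the model weights alone at a few common precisions. The helper name `weight_footprint_gb` and the choice of 16-, 8-, and 4-bit precisions are illustrative assumptions, not taken from the text, and the estimate ignores the KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_footprint_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same precision and
    nothing else (KV cache, activations, framework overhead) is counted.
    """
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 70e9  # 70B parameters

for bits in (16, 8, 4):
    gb = weight_footprint_gb(N_PARAMS, bits)
    print(f"{bits}-bit weights: ~{gb:.0f} GB")
```

Under these assumptions, 16-bit weights come to about 140 GB, which is more than a single consumer GPU typically offers in dedicated memory, while a 4-bit quantization comes to about 35 GB. That gap is why the size of the memory pool the GPU can address matters so much for models of this size.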