Are you planning to create a MOE model with 100-200 billion parameters?

#3
by Kosh69 - opened

Unlike dense models, a MOE model can realistically be run locally on a CPU and GPU.

Sign up or log in to comment