As noted in an earlier issue, the pace of development in this space is overwhelming, and I'm sure this project will gain traction. At that point it won't be feasible for you to manually add support for everything users request.
Hence, would it be possible to let users choose their own models, probably downloaded locally in GGML or GPTQ format as is popular with Ooga Booga? This would also give users flexibility according to their devices (e.g. scaling down to 1.3B models or up to 30+B models).