Thank you for the excellent plugin, I love it! I followed the guide, and there is a more straightforward way to connect to Ollama that does not require LiteLLM: Ollama already serves an OpenAI-compatible API. Tested and working.

### Offline chat model with Ollama

1. Install Ollama.
2. Pick an LLM from the Ollama library and run `ollama run MODELNAME` (e.g., `ollama run llama3:latest`) in a terminal. You can confirm the model is available with the sketch below.
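Before touching the plugin settings, it can help to double-check that the Ollama server is up and the model has been pulled. A minimal sketch, assuming Ollama is running locally on its default port 11434 (swap in your own host):

```python
import requests

# /api/tags is Ollama's native "list local models" endpoint.
OLLAMA_HOST = "http://localhost:11434"  # replace with your ollama-IP if remote

resp = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=10)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", models)

# The name you see here (e.g. "llama3:latest") is what goes into
# the "custom model ID" setting below.
```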
| Setting | Advanced | Value |
|---|---|---|
| Model: OpenAI API Key | No | Something, anything |
| Chat: Model | No | (online) OpenAI or compatible: custom model |
| Chat: Timeout (sec) | Yes | 600 |
| Chat: OpenAI (or compatible) custom model ID | Yes | MODELNAME |
| Chat: Custom model is a conversation model | Yes | Yes |
| Chat: Custom model API endpoint | Yes | `http://<ollama-IP>:11434/chat/completions` |
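If the plugin can't reach the model, testing the endpoint outside the plugin narrows things down. Here is a minimal sketch of the kind of request the plugin ends up making, assuming the server is reachable at `http://localhost:11434` and using Ollama's documented OpenAI-compatible path `/v1/chat/completions` (the plugin may handle the path prefix differently); the host and model name are placeholders:

```python
import requests

OLLAMA_HOST = "http://localhost:11434"  # replace with your ollama-IP
MODEL = "llama3:latest"                 # the model ID from the table above

# OpenAI-compatible chat completion request against the Ollama server.
payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Say hello in one short sentence."}
    ],
}

resp = requests.post(f"{OLLAMA_HOST}/v1/chat/completions", json=payload, timeout=600)
resp.raise_for_status()

# The reply comes back in the standard OpenAI response shape.
print(resp.json()["choices"][0]["message"]["content"])
```

If this returns a sensible reply, the values in the table above should work as-is.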
