I have been playing around with Koboldcpp for writing stories and chats. It’s really easy to set up and run compared to Kobold AI. The best part is that it runs locally and, depending on the model, uncensored. The only downsides are the memory requirements of some models and the generation speed, which for me is around 65 seconds with an 8 GB model. You can use the included UI for stories or chats, or connect it to Tavern AI for a Character AI-like experience. This thread could be for model recommendations and character/story sharing.
Setting up Koboldcpp:
- Download Koboldcpp and put the .exe in its own folder to keep things organized.
- Download a ggml model and put the .bin file in the same folder as Koboldcpp. I’ve used gpt4-x-alpaca-native-13B-ggml the most for stories, but you can find other ggml models on Hugging Face. Generally, the bigger the model, the slower but better the responses.
- Open Koboldcpp and, if you have a GPU, select CLBlast GPU #1 for faster generation. If it crashes during the first generation, relaunch and leave it on OpenBLAS. Click Launch and select the ggml model; after a little while it will open a new tab with the UI.
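If you prefer the command line over the launcher window, Koboldcpp also accepts flags directly. A rough sketch, assuming a recent build (flag names can vary between versions, and the model filename here is just an example):

```shell
# Launch Koboldcpp with a model, offloading to the GPU via CLBlast.
# The two numbers are the OpenCL platform and device; 0 0 is a common default.
koboldcpp.exe --model gpt4-x-alpaca-native-13B-ggml.bin --useclblast 0 0

# If CLBlast crashes on the first generation, fall back to the default OpenBLAS backend:
koboldcpp.exe --model gpt4-x-alpaca-native-13B-ggml.bin
```

Either way, Koboldcpp will serve the UI in your browser once the model finishes loading.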
Tips:
- Memory and Author’s Note are important for coherent stories. Memory is for things the AI should always remember, like character descriptions and places. The Author’s Note is for directing the AI, such as setting the theme and story direction.
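To see why these two fields behave differently, it helps to know roughly where they land in the prompt: Kobold-style UIs typically put Memory at the very top of the context and insert the Author's Note a few lines from the end, where it has the strongest pull on the next generation. A minimal sketch of that assembly (the function name and the fixed insertion depth are my own illustration, not Koboldcpp's actual internals):

```python
def build_context(memory, story_lines, authors_note, an_depth=3):
    """Assemble a prompt the way Kobold-style UIs typically do:
    Memory at the top, Author's Note inserted a few lines from the end."""
    lines = list(story_lines)
    note = f"[Author's note: {authors_note}]"
    # Insert the note `an_depth` lines before the end of the story so far.
    insert_at = max(0, len(lines) - an_depth)
    lines.insert(insert_at, note)
    return memory + "\n" + "\n".join(lines)
```

This is why Memory suits permanent facts (the model sees them first, as background) while the Author's Note suits steering (the model sees it last, right before it writes).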
- You can select presets under Settings to change how the AI responds; I personally use Godlike for better descriptions.
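Under the hood, presets like Godlike are mostly bundles of sampler settings such as temperature and top-p. A toy sketch of what those two knobs do when picking the next token (the values here are illustrative, not the actual Godlike numbers):

```python
import math
import random

def sample_token(logits, temperature=0.7, top_p=0.9):
    """Toy next-token sampler: temperature scaling followed by
    nucleus (top-p) filtering, as controlled by Kobold presets."""
    # Temperature: lower = more deterministic, higher = more varied.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    # Top-p: keep the smallest set of tokens whose probability mass >= top_p.
    probs.sort(key=lambda p: p[1], reverse=True)
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize over the kept tokens and sample one of them.
    norm = sum(p for _, p in kept)
    r = random.random() * norm
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

Presets that feel "more creative" generally raise temperature or loosen top-p; "safer" ones do the opposite.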
- In Scenarios there are some built-in ones, but you can also import scenarios from aetherroom.club.
That is all, have fun with your own AI!
I will make a Tavern AI tutorial later on.
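In the meantime, the reason frontends like Tavern AI can plug in at all is that Koboldcpp exposes a Kobold-compatible HTTP API once it's running (by default on localhost port 5001). A minimal sketch of talking to it yourself, assuming the default port and the `/api/v1/generate` endpoint; field names follow the Kobold API convention, so double-check them against your version:

```python
import json
from urllib import request

# Default Koboldcpp address; adjust the port if you launched with a different one.
API_URL = "http://localhost:5001/api/v1/generate"

def build_payload(prompt, max_length=80, temperature=0.7):
    # Minimal generation request body for the Kobold-style API.
    return {"prompt": prompt, "max_length": max_length, "temperature": temperature}

def generate(prompt):
    # POST the JSON payload and pull the generated text out of the response.
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(API_URL, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["results"][0]["text"]
```

This is essentially what Tavern AI does for you behind its chat interface.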