Stability AI’s new open-source chatbot uses ChatGPT’s special sauce



Summary

Stability AI releases StableVicuna, the first large-scale open-source chatbot trained with human feedback.

Stability AI, the company behind the successful open-source image model Stable Diffusion, has released StableVicuna, an open-source chatbot. The chatbot is based on the Vicuna chatbot released in early April, which is a 13-billion-parameter LLaMA model tuned with the Alpaca formula.

What sets the Vicuna variant from Stability AI and Carper AI apart is that the model was improved using so-called “Reinforcement Learning from Human Feedback” (RLHF) (see below for explanation).

This was done using datasets from OpenAssistant, Anthropic, and Stanford University, as well as the open-source training framework trlX, also from Carper AI. Stability AI is working with OpenAssistant on larger RLHF datasets for future models.


According to Stability AI, StableVicuna can generate text, do simple math, and write code. In common benchmarks, StableVicuna is on par with previously released open-source chatbots. However, benchmarks are only partially indicative of how a model will perform in practice.

Picture: Stability AI

According to Stability AI, StableVicuna will be developed further and launched on Discord soon. A demo is now available on Hugging Face. Stability AI also plans to make StableVicuna available through a chat interface soon.

Developers can download the model’s weights as a delta to the original LLaMA model on Hugging Face. Those who want to use StableVicuna themselves will need access to the original LLaMA weights, which can be requested here. Commercial use is not allowed.
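To illustrate what distributing weights "as a delta" means, here is a minimal Python sketch. It assumes the delta is simply the per-parameter difference between the fine-tuned and the original weights, so the usable model is recovered by adding the two together. The function name `apply_delta`, the toy parameter names, and the use of plain lists instead of real tensors are all assumptions made for illustration; the actual release may rely on its own tooling and file formats.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Return base + delta, parameter by parameter (illustrative only)."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # The fine-tuned value is the original value plus the published difference.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy example with made-up numbers, not real model weights.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -0.5], "layer.bias": [0.0]}
merged = apply_delta(base, delta)
```

Under this assumption, the delta on its own is of no use without the original weights, which is why access to LLaMA is still required.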

The problem with open-source chatbots that are refined with generated chatbot data is the risk of an echo chamber, in which the AI models reinforce their existing errors and biases through successive rounds of training. In addition, training data generated for fine-tuning can reinforce hallucinations if it contains information not present in the original model.


Stability AI’s GitHub.


