Our chatty AI man
Get your Pi to chat like a human, mostly.
Our first test involves running the LLaMA payload. Discussed in LXF304, it creates a ChatGPT-like bot that runs locally. Because the model is extremely large and has to be obtained via BitTorrent, this task cannot be completed on the Raspberry Pi alone: a workstation is required to handle some of the more computationally intensive preparation tasks.
Furthermore, be aware that the quantisation steps have to be performed once again even if you’ve done them in the past. The LLaMA framework receives frequent upgrades – trying to use a recent version with an ancient quantisation leads to errors.
Excursus: swap it
Ever since advanced operating systems became available on all kinds of workstations, it has been popular to expand working memory by moving data out to persistent storage, a process called swapping.
In theory, swap lets you expand the Raspberry Pi’s RAM almost without limit. In practice, the main limitation is usually the microSD card: such cards tend not to respond well to heavy write loads.
In the following steps, we work around the problem via the contraption pictured (below). It is a tool-free enclosure fitted with a 128GB SSD left over from a workstation. When it is connected to one of the Pi 4’s USB 3.0 ports, decent transfer rates can be achieved.
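Before loading a large model, it can be worth confirming how much RAM and swap the Pi actually sees. The following is a minimal Python sketch, not part of the original workflow: it assumes a Linux system that exposes /proc/meminfo, and the helper names parse_meminfo and memory_summary are our own invention for illustration.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each /proc/meminfo field name to its numeric value.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts and are returned as-is.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(fields: dict[str, int]) -> dict[str, float]:
    """Convert the RAM and swap totals from kB to GiB."""
    kib_per_gib = 1024 * 1024
    ram = fields.get("MemTotal", 0) / kib_per_gib
    swap = fields.get("SwapTotal", 0) / kib_per_gib
    return {"ram_gib": ram, "swap_gib": swap, "total_gib": ram + swap}


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        summary = memory_summary(parse_meminfo(meminfo.read_text()))
        print(f"RAM:   {summary['ram_gib']:.2f} GiB")
        print(f"Swap:  {summary['swap_gib']:.2f} GiB")
        print(f"Total: {summary['total_gib']:.2f} GiB")
```

If the swap figure is zero, or far smaller than the space set aside on the SSD, the swap area is probably not active yet.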