llamero

What is llamero?

Simply put, llamero is a shard for Crystal that allows you to interact with llama.cpp models from within your application.

Here's a basic example:

require "llamero"

model = Llamero::BaseModel.new(model_name: "meta-llama-3-8b-instruct-Q6_K.gguf")

puts model.quick_chat([{ role: "user", content: "Hey Llama! Tell me your best joke about programming" }])

Before you start

Currently, you will need to clone the llama.cpp repo, build it, and symlink the main binary to /usr/local/bin/llamacpp for this shard to work as intended.

You will also need Python 3.12 or later and pip. On macOS, Homebrew's Python formula ships with pip, so a single install covers both:

brew install python3

Then you can clone and build llama.cpp

Important note: these instructions pin you to an older release of llama.cpp because of a bug introduced between late February and March 2024. The bug has not been fixed yet, and it breaks this shard entirely: the llama.cpp binary will not execute through the symbolic link that lets you run it from outside the llama.cpp directory.

cd ~/ && git clone git@github.com:ggerganov/llama.cpp.git && cd llama.cpp && git fetch --tags && git checkout f1a98c52 && make

You will now be on a stable version of llama.cpp and able to make the symbolic link to run this shard. Because you checked out a specific commit, you will be in a detached HEAD state. If you later switch to master/main or another release, check out the f1a98c52 commit again and rebuild before using this shard.

Now create the symlink for the main binary. Run this from the root of the llama.cpp directory.

For Mac users, this command will create the symlink for you:

sudo ln -s $(pwd)/main /usr/local/bin/llamacpp

Next we'll link the tokenizer

sudo ln -s $(pwd)/tokenize /usr/local/bin/llamatokenize
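
Before moving on, it can help to confirm that both links resolve to something executable. The following is a minimal sketch and not part of llamero: `check_link` is a hypothetical helper name, and the two paths are simply the link targets used in the commands above.

```shell
# Report whether a path is a symlink that resolves to an executable file.
# check_link is a hypothetical helper, not something llamero or llama.cpp provides.
check_link() {
  link="$1"
  if [ ! -L "$link" ]; then
    echo "missing: $link is not a symlink"
    return 1
  fi
  # -x follows the link, so a dangling symlink fails here.
  if [ ! -x "$link" ]; then
    echo "broken: $link does not resolve to an executable"
    return 1
  fi
  echo "ok: $link"
}

# Paths taken from the symlink commands above.
check_link /usr/local/bin/llamacpp || true
check_link /usr/local/bin/llamatokenize || true
```

If either line reports "missing" or "broken", re-run the matching ln command from the root of the llama.cpp directory.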

You will also need to download at least one model. The table below is a quick reference. Any model already quantized into the GGUF format will work, or you can convert your own models using the llama.cpp quantization tool.

Choose a model from below to start with. Each link should take you directly to the model's files page, where you can download the model file.

Model Name	Description	RAM Required	Prompt Template
Mixtral dolphin-2.7-mixtral-8x7b-GGUF	A quantized Dolphin fine-tune of Mixtral 8x7B, works about as well as GPT-4	~27GB	chat template
Mistral-7B-instruct-v0.2-GGUF	A quantized model from Mistral, works about as well as GPT-3.5	~6GB	chat template
Llama3 8b-Instruct-GGUF	A quantized Llama 3 model, works about as well as GPT-4 but with limited knowledge	~8GB	chat template

You can always use a different model, as long as it is in the quantized GGUF format. Otherwise, quantize it first with llama.cpp's quantization tool.

Move the model you downloaded into the directory your project will be configured to load models from. I recommend ~/models, as this is the default directory that Llamero checks for models.
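
As a rough sketch of that step, the helper below moves a downloaded file into ~/models and creates the directory if it does not exist. `install_model` is a hypothetical name used only for illustration, and the ~/Downloads location is an assumption about where your browser saved the file. The filename is the one from the basic example at the top of this README.

```shell
# Move a downloaded GGUF file into the models directory, creating it if needed.
# Usage: install_model <path-to-gguf> [models-dir]
# install_model is a hypothetical helper, not part of llamero.
install_model() {
  src="$1"
  dest_dir="${2:-$HOME/models}"
  if [ ! -f "$src" ]; then
    echo "not found: $src"
    return 1
  fi
  mkdir -p "$dest_dir"
  mv "$src" "$dest_dir/"
  echo "installed: $dest_dir/$(basename "$src")"
}

# Assumed download location; adjust to wherever the file was actually saved.
install_model "$HOME/Downloads/meta-llama-3-8b-instruct-Q6_K.gguf" || true
```

Pass a second argument if you keep models somewhere other than ~/models, and point your project's configuration at the same directory.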

Installation

Add the dependency to your shard.yml:

dependencies:
  llamero:
    github: crimson-knight/llamero

Run shards install

Usage

require "llamero"

TODO: Write usage instructions here

Development

To Do: [] Generate chat templates by reading from the model (integrate with HF's C-lib)

TODO: Write development instructions here

Contributing

Fork it (https://github.com/crimson-knight/llamero/fork)
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create a new Pull Request

Contributors

crimson-knight - creator and maintainer
