Fine-tune LLM foundation models with InstructLab, an Open-Source project introduced by IBM and Red Hat

This blog post is a step-by-step guide to setting up the InstructLab CLI to run an LLM locally on an Apple laptop with an Apple M3 chip. We’ll start with an overview of InstructLab and its benefits, then move on to the detailed setup process.

In the IBM Research blog post The technology behind InstructLab, a low-cost way to customize LLMs, you can find the following information about InstructLab:

InstructLab was introduced by IBM and Red Hat as an Open-Source project to “lower the cost of fine-tuning large language models by allowing people to add new knowledge and skills to any model collaboratively.”

“Using a local version of InstructLab’s synthetic data generator, you can create instructions to align your models, experimenting until they perform the target task. Once a recipe has been perfected, you can submit it as a pull request to the InstructLab taxonomy on GitHub like any other open-source project.”

Here are the two supported models, based on the information as of 20 June 2024.

The blog post is structured in the following sections:

  1. Getting started
  2. Steps for an Apple M3 Laptop
    1. Step 1: Create a folder
    2. Step 2: Install ilab CLI
    3. Step 3: Add tab completion for the ilab command for Bash and Zsh shells
    4. Step 4: Initialize InstructLab
    5. Step 5: Use the default config
    6. Step 6: Download the taxonomy repository
    7. Step 7: Verify the created folder and files with the command tree
    8. Step 8: Verify the content of the config.yaml
    9. Step 9: Download the model specified in the config.yaml
    10. Step 10: Serve the model
    11. Step 11: Access the model using the REST API
    12. Step 12: Use curl to interact with the served model
    13. Step 13: Open a new terminal from the “instructlab” folder and chat with the model
  3. Summary

1. Getting started

“InstructLab uses a novel synthetic data-based alignment tuning method for Large Language Models (LLMs). The ‘lab’ in InstructLab 🐶 stands for Large-Scale Alignment for ChatBots.” (Resource: InstructLab Taxonomy, 2024/06/25)

In the IBM Research blog post The technology behind InstructLab, a low-cost way to customize LLMs, you can find the YouTube video “InstructLab Demo: Lowering the barrier to AI model development,” which gives a practical introduction to the usage and which I can recommend.

The following is my simplified, high-level extraction of the practical, more technical steps to contribute to a model, based on the information in the YouTube video (a command-level sketch follows the list):

  1. Prepare data for Taxonomy tree
  2. Validate data in Taxonomy tree
  3. Generate synthetic data
  4. Train model
  5. Convert model
  6. Serve model
  7. Test model
  8. Contribute changes
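
As a rough orientation, here is a minimal sketch of how these steps could map to ilab commands. The command names are an assumption based on the command groups of the mid-2024 CLI (ilab taxonomy, ilab data, ilab model); check ilab --help for the exact names in your version.

# Hedged sketch: mapping the workflow steps to ilab commands (mid-2024 CLI).
# Step 1 (prepare data) is manual editing of qna.yaml files in the taxonomy tree.
ilab taxonomy diff      # 2. validate the changed taxonomy files
ilab data generate      # 3. generate synthetic training data
ilab model train        # 4. train (fine-tune) the model with the generated data
ilab model convert      # 5. convert the trained model (e.g., to GGUF)
ilab model serve        # 6. serve the model locally
ilab model chat         # 7. chat with the model to test it
# 8. contribute changes: open a pull request against the taxonomy repository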

In short, we can say the taxonomy contains the initial and the updated data for the fine-tuning, and ilab does the fine-tuning.

  • ilab generates synthetic data for the training based on the taxonomy.
  • The taxonomy tree allows you to create models tuned with your own data (a sketch of a taxonomy entry follows below).
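
To make this concrete, here is a minimal sketch of what a skill contribution in the taxonomy tree could look like. The path and the qna.yaml field names are assumptions based on the mid-2024 taxonomy schema; check the taxonomy repository’s documentation for the current format.

# Hedged sketch: a minimal freeform-skill entry in the taxonomy tree.
# The path and the qna.yaml schema (version 2) are assumptions; verify
# them against the taxonomy repository before contributing.
mkdir -p taxonomy/compositional_skills/writing/freeform/haiku
cat > taxonomy/compositional_skills/writing/freeform/haiku/qna.yaml <<'EOF'
version: 2
task_description: Write a haiku about a given topic.
created_by: your-github-username
seed_examples:
  - question: Write a haiku about autumn leaves.
    answer: |
      Crimson leaves drifting
      softly over quiet streams
      autumn lets them go
EOF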

These are the three major repositories in the InstructLab project on GitHub:

  • InstructLab, the CLI for training, chatting with, and running the model.
  • Taxonomy, the data for the training.
  • Community, for exchanging on the relevant topics.

Note: The screenshot is from 21 June 2024.

2. Set up the InstructLab CLI on an Apple Laptop with an Apple M3 chip

This step-by-step guide is an extract of the detailed setup instructions for InstructLab on GitHub, used to run a model locally on macOS with an Apple M3 chip.

Here is a 15-minute YouTube video related to the setup.

2.1 Steps for an Apple M3 Laptop

Step 1: Create a folder

mkdir instructlab
cd instructlab

Step 2: Install ilab CLI

Currently, the InstructLab CLI uses llama.cpp for LLM inference (link, 21 June 2024), so it is relevant to know whether the GPUs on your machine are enabled or not (llama.cpp in the InstructLab CLI, link, 21 June 2024). It also uses Low-Rank Adaptation (LoRA), a popular and lightweight training technique that significantly reduces the number of trainable parameters (LoRA in InstructLab, link, 25 June 2024).

Note on the llama.cpp installation on a Mac: On macOS, Metal is enabled by default. Using Metal makes the computation run on the GPU. To disable the Metal build at compile time, use the LLAMA_NO_METAL=1 flag or the LLAMA_METAL=OFF cmake option.

python3 -m venv --upgrade-deps venv   # create a Python virtual environment
source venv/bin/activate              # activate the virtual environment
pip cache remove llama_cpp_python     # remove cached llama-cpp-python builds
pip install instructlab               # install the InstructLab CLI

FYI: How to set up a virtual environment for Python
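
If you want to force a CPU-only build, one option is to rebuild llama-cpp-python inside the virtual environment with Metal disabled. This is a hedged sketch using the cmake option from the note above; the exact CMAKE_ARGS accepted depend on your llama-cpp-python version.

# Hedged sketch: reinstall llama-cpp-python without Metal (CPU-only build).
# Assumption: the LLAMA_METAL cmake option mentioned above applies to
# your llama-cpp-python version.
CMAKE_ARGS="-DLLAMA_METAL=OFF" pip install --force-reinstall --no-cache-dir llama-cpp-python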

Step 3: Add tab completion for the ilab command for Bash and Zsh shells

  • Bash: To enable completion automatically for each new terminal instance, add the following to your ~/.bashrc file.

FYI: Apple will not update Bash, because the latest version is licensed under GPLv3, which Apple cannot use. They have updated most of their other shells, though. Zsh, for example, is mostly up to date.

Source: macos – Update bash to version 4.0 on OSX – Ask Different. https://apple.stackexchange.com/questions/193411/update-bash-to-version-4-0-on-osx/197172

eval "$(_ILAB_COMPLETE=bash_source ilab)"
_ILAB_COMPLETE=bash_source ilab > ~/.ilab-complete.bash
echo ". ~/.ilab-complete.bash" >> ~/.bashrc
  • Zsh: To enable completion automatically for each new terminal instance, add the following to your ~/.zshrc file.

The first two lines were added to avoid the error “zsh: command not found: compdef”.

autoload -Uz compinit   # load zsh's completion system
compinit                # initialize it (provides compdef)
eval "$(_ILAB_COMPLETE=zsh_source ilab)"
# Alternatively, generate the completion script once and source it from ~/.zshrc:
_ILAB_COMPLETE=zsh_source ilab > ~/.ilab-complete.zsh
echo ". ~/.ilab-complete.zsh" >> ~/.zshrc

Step 4: Initialize InstructLab

ilab config init

Step 5: Use the default config

Please provide the following values to initiate the environment [press Enter for defaults]: <ENTER>

Step 6: Download the taxonomy repository “https://github.com/instructlab/taxonomy.git” by pressing “y”

`taxonomy` seems to not exist or is empty. Should I clone https://github.com/instructlab/taxonomy.git for you? [y/N]: <y>

Step 7: Verify the created folder and files with the command tree

  • Install tree:
brew install tree
  • Run the tree command for one level:
tree -L 1 ./

You can see that a new file, config.yaml, was generated.

Output:

.
├── config.yaml
├── taxonomy
└── venv

Step 8: Verify the content of the config.yaml

chat:
  context: default
  greedy_mode: false
  logs_dir: data/chatlogs
  max_tokens: null
  model: models/merlinite-7b-lab-Q4_K_M.gguf
  session: null
  vi_mode: false
  visible_overflow: true
general:
  log_level: INFO
generate:
  chunk_word_count: 1000
  model: models/merlinite-7b-lab-Q4_K_M.gguf
  num_cpus: 10
  num_instructions: 100
  output_dir: generated
  prompt_file: prompt.txt
  seed_file: seed_tasks.json
  taxonomy_base: origin/main
  taxonomy_path: taxonomy
serve:
  gpu_layers: -1
  host_port: 127.0.0.1:8000
  max_ctx_size: 4096
  model_path: models/merlinite-7b-lab-Q4_K_M.gguf
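
The serve section defines which model is served and where. If you later want to serve a different model without editing config.yaml, the CLI accepts an override; this is a sketch assuming the --model-path option of the mid-2024 ilab model serve command:

# Hedged sketch: serve a different downloaded model than the one in config.yaml.
# Assumption: the --model-path option of the mid-2024 CLI; see `ilab model serve --help`.
ilab model serve --model-path models/granite-7b-lab-Q4_K_M.gguf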

Step 9: Download the model specified in the config.yaml

In this case, the model is models/merlinite-7b-lab-Q4_K_M.gguf, in GGUF (GPT-Generated Unified Format), and it will be downloaded from Hugging Face.

ilab model download

Now we see that the models folder with the model was created.

tree -L 1 ./
./
├── config.yaml
├── models
├── taxonomy
└── venv
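
By default, ilab model download fetches the default model from the InstructLab organization on Hugging Face. If you want a different model, the command accepts a repository override; this is a sketch assuming the --repository and --filename options of the mid-2024 CLI:

# Hedged sketch: download another GGUF model from Hugging Face.
# Assumption: the --repository/--filename options of the mid-2024 CLI;
# see `ilab model download --help` for your version.
ilab model download --repository instructlab/granite-7b-lab-GGUF --filename granite-7b-lab-Q4_K_M.gguf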

Step 10: Serve the model

source venv/bin/activate
ilab model serve
  • Output:
INFO 2024-06-20 18:05:09,972 serve.py:51: serve Using model 'models/merlinite-7b-lab-Q4_K_M.gguf' with -1 gpu-layers and 4096 max context size.
INFO 2024-06-20 18:05:10,212 server.py:218: server Starting server process, press CTRL+C to shutdown server...
INFO 2024-06-20 18:05:10,213 server.py:219: server After application startup complete see http://127.0.0.1:8000/docs for API.
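
The server exposes an OpenAI-compatible REST API (it is based on the llama-cpp-python server). A quick way to check that it is up, assuming the standard /v1/models endpoint of that API:

# Quick check that the server is running; /v1/models lists the served model(s).
curl http://127.0.0.1:8000/v1/models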

Step 11: Access the model using the REST API

Open a browser and enter the URL http://127.0.0.1:8000/docs

Step 12: Using curl to interact with the served model

Enter the following curl command:

curl -X 'POST' \
  'http://127.0.0.1:8000/v1/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt": "\n\n### Instructions:\nWhat is the capital of France?\n\n### Response:\n",
  "stop": [
    "\n",
    "###"
  ]
}'
  • Output
{"id":"cmpl-3e4d9c5e-e0e9-4fb1-b9b4-b79093fa0106","object":"text_completion","created":1720678017,"model":"./models/MERLINITE-7B-LAB-Q4_K_M.GGUF.gguf","choices":[{"text":"Paris","index":0,"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":21,"completion_tokens":3,"total_tokens":24}}

Step 13: Open a new terminal from the “instructlab” folder and chat with the model

Note: Ensure you have activated the Python virtual environment in the new terminal!
Now you can chat; you can close the session with the exit command.

source venv/bin/activate
ilab model chat         
╭──────────────────────────────────────────────────────────────────────────────── system ─────────────────────────────────────────────────────────────────────────────────╮
│ Welcome to InstructLab Chat w/ MODELS/MERLINITE-7B-LAB-Q4_K_M.GGUF (type /h for help)                                                                                   │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
>>>

3. Summary

This Open-Source project is an exciting start for the Open-Source community to maintain and develop foundation models collaboratively, unlike the Hugging Face approach, where every little change leads to a new model. How to contribute will be an extra topic for future blog posts.

Here is the blog post about fine-tuning with data creation, training, testing, and verifying: InstructLab and Taxonomy tree: LLM Foundation Model Fine-tuning Guide | Musician Example


I hope this was useful to you, and let’s see what’s next!

Greetings,

Thomas

#finetuning, #llm, #instructlab, #ai, #opensource
