The popularity of llama.cpp and the optimized GGUF model format is growing. This post outlines the steps to run "Phi-3-Small-128K-Instruct" in GGUF format with llama.cpp on an IBM Cloud VSI with GPUs and Ubuntu 22.04. It covers VSI setup, the CUDA toolkit, compilation, the Python environment, model usage, and additional resources.
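Once llama.cpp is built with CUDA support and the Python environment is set up, loading and prompting a GGUF model can be sketched with the llama-cpp-python bindings roughly as follows; the model file name is a placeholder for the quantized file you actually download:

    # Minimal sketch: load a GGUF model with llama-cpp-python
    # (pip install llama-cpp-python). The model path is a placeholder.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./Phi-3-small-128k-instruct.Q4_K_M.gguf",  # placeholder file name
        n_gpu_layers=-1,  # offload all layers to the GPU (needs a CUDA build)
        n_ctx=4096,       # context window; the model supports up to 128K
    )
    output = llm("Q: What is the GGUF format? A:", max_tokens=64)
    print(output["choices"][0]["text"])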
Getting started with Text Generation Inference (TGI) using a container to serve your LLM
This blog post outlines a bash automation for setting up and testing Text Generation Inference (TGI) using a container. It provides instructions for creating a Python test client, starting the TGI server, and troubleshooting common issues. The post emphasizes the benefits of using containers and references the underlying Hugging Face and NVIDIA technologies.
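A minimal Python test client along these lines can verify that the server answers, assuming the container's port is mapped to localhost:8080 (TGI exposes a /generate endpoint):

    # Minimal sketch of a test client for a local TGI server.
    # Host and port are assumptions about your container mapping.
    import requests

    response = requests.post(
        "http://localhost:8080/generate",
        json={
            "inputs": "What is Text Generation Inference?",
            "parameters": {"max_new_tokens": 50},
        },
        timeout=60,
    )
    response.raise_for_status()
    print(response.json()["generated_text"])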
How to create a FastAPI server to use OpenAI models
Last time, I wrote a blog post about "IBM Watsonx.ai and a simple question-answering pipeline using Python and FastAPI". Afterwards, I had an exchange with my family about an OpenAI sample for a FastAPI application, so I created a small FastAPI server to access OpenAI with Python.
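The core of such a server can be sketched as follows; the /ask route and the model name are illustrative choices, and the OpenAI client reads the OPENAI_API_KEY environment variable:

    # Minimal sketch of a FastAPI endpoint that forwards a question to
    # OpenAI (pip install fastapi uvicorn openai). Route and model name
    # are illustrative.
    from fastapi import FastAPI
    from openai import OpenAI

    app = FastAPI()
    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    @app.get("/ask")
    def ask(question: str):
        completion = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": question}],
        )
        return {"answer": completion.choices[0].message.content}

Start it with, for example, uvicorn main:app --reload and open http://localhost:8000/ask?question=Hello.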
How to set up Caikit and use Hugging Face model examples
This small blog post is about how to set up a demo environment for using Caikit and Hugging Face models on your local machine.
Show the collection IDs of IBM Cloud Watson Discovery projects using cURL
This blog post is a simple example (cheat sheet) of listing the collections for a project in Watson Discovery using cURL and the IBM Cloud Watson Discovery API V2. You can get more details in the IBM Cloud Watson Discovery API documentation.

1. Log on to IBM Cloud:

    ibmcloud login (-sso)
    REGION=us-south
    GROUP=default
    ibmcloud target -r...
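For comparison, the same listing can be sketched in Python with requests; the service URL, API key, and project ID are placeholders, and the version date is just an example value:

    # List the collection IDs of a Watson Discovery project (API V2).
    # SERVICE_URL, API_KEY, and PROJECT_ID are placeholders.
    import requests

    SERVICE_URL = "https://api.us-south.discovery.watson.cloud.ibm.com/instances/INSTANCE_ID"
    API_KEY = "YOUR_API_KEY"
    PROJECT_ID = "YOUR_PROJECT_ID"

    response = requests.get(
        f"{SERVICE_URL}/v2/projects/{PROJECT_ID}/collections",
        params={"version": "2023-03-31"},  # example version date
        auth=("apikey", API_KEY),
    )
    response.raise_for_status()
    for collection in response.json()["collections"]:
        print(collection["collection_id"], collection["name"])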
Some thoughts about ChatGPT and AI
Everyone is now talking about this new way of using AI in an interactive form of communication. When we talk about AI that is free or open, these three questions immediately come to my mind: "What will be the business model?" ("If you're not paying for the product, then you are the product," as Daniel Hövermann puts it in The Social Dilemma.) "Will my job be replaced?" "Can I trust it, and what is my remaining responsibility?"
Watson NLP for Embed customize a classification model and use it on your local machine
This blog post is about how to customize a classification model for Watson NLP for Embed and use it on your local machine.
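Once the runtime container serves the custom model, a classification request can be sketched like this; the port mapping and the model ID are assumptions, and the endpoint path follows the runtime's REST mapping of its gRPC NlpService:

    # Hedged sketch: query a custom classification model served by the
    # Watson NLP runtime. Port and model ID are placeholders.
    import requests

    response = requests.post(
        "http://localhost:8080/v1/watson.runtime.nlp.v1/NlpService/ClassificationPredict",
        headers={"grpc-metadata-mm-model-id": "my-classification-model"},  # placeholder ID
        json={"rawDocument": {"text": "I really like this product!"}},
    )
    response.raise_for_status()
    print(response.json())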
Create a custom dictionary model for Watson NLP
This blog post is about how to create a custom dictionary model for Watson NLP. One capability of Watson NLP is "entity extraction to find mentions of entities (like person, organization, or date)." We will adapt the Watson NLP model to extract entities from a given text, finding single entities such as names and locations, each identified by a dictionary entry and its label.
Run Watson NLP for Embed on IBM Cloud Code Engine
This blog post is about using the IBM Watson Natural Language Processing Library for Embed on IBM Cloud Code Engine and is related to my blog post Run Watson NLP for Embed on your local computer with Docker. IBM Cloud Code Engine is a fully managed, serverless platform where you can run container images or batch jobs.
Run Watson NLP for Embed on your local computer with Docker
This blog post is about using the IBM Watson Natural Language Processing Library for Embed on your local computer with Docker. The IBM Watson Libraries for Embed are made for IBM Business Partners, who can find additional details about embeddable AI on the IBM Partner World page. If you are an IBM Business Partner, you can get free access to the IBM Watson Natural Language Processing Library for Embed. To get started with the libraries, use the link Watson Natural Language Processing Library for Embed home. The documentation is excellent and publicly available.
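As a quick smoke test of the running container, a stock syntax model can be called roughly like this; the port mapping and the model ID syntax_izumo_lang_en_stock are assumptions based on the documented stock models:

    # Hedged sketch: call the locally running Watson NLP runtime with a
    # stock syntax model. Port and model ID are assumptions.
    import requests

    response = requests.post(
        "http://localhost:8080/v1/watson.runtime.nlp.v1/NlpService/SyntaxPredict",
        headers={"grpc-metadata-mm-model-id": "syntax_izumo_lang_en_stock"},
        json={"rawDocument": {"text": "Watson NLP runs in a local Docker container."}},
    )
    response.raise_for_status()
    print(response.json())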
