Add Your Knowledge

Most organizations that have been using OpenShift Container Platform for any length of time have developed their own policies, procedures, templates, and other organizationally-specific: information about OpenShift.

As you learned earler, OpenShift Lightspeed only uses the OpenShift documentation to support answers by default. So, how can you get your important internal knowledge into the system? Well, there’s a process for that.

This is currently a Technology Preview feature of OpenShift Lightspeed. The method of using podman to build and push a RAG database from local markdown files is just a first draft and will be replaced by a more sophisticated system in the future. If this feature is important to you, please take time to talk to us after the lab or schedule a meeting to discuss your particular scenarios.

Examine the documentation

In the web terminal, clone the lab repository:

cd
git clone https://github.com/rhpds/lb1344-showroom-openshift-lightspeed-and-rag-pipeline

The documentation is located in the ~/lb1344-showroom-openshift-lightspeed-and-rag-pipeline/byok folder, and, right now, consists of just two documents:

the specific ACME process for horizontal pod autoscaling
the specific ACME process for adding GPU nodes

Take a look at those files using cat, less, or your favorite tool to get an understanding of what they are discussing.

OpenShift CLI

You will need to perform a few commands using the OpenShift CLI (oc). This requires being logged into the cluster. Your terminal session should already be logged into the OpenShift cluster:

oc whoami

If you’re not logged in, that’s OK, but if you are logged in, and you see system:admin, you still need to perform the login steps. Why?

The system:admin user is a special user that uses a certificate for authentication, but some of the steps you need to perform need to be done as a user with a token for authentication, and you can’t get a token for system:admin, so you need to login with the following instructions.

CLI Login

You can use the following command to login:

oc login -u=admin -p={openshift_cluster_admin_password} --server={openshift_api_server_url}

Build the RAG Database Image

In the architecture slide, you saw how OpenShift Lightspeed used a RAG database inside the server that provided the relevant documentation to support user questions. Now, you will use a podman-based process to build another RAG database image that you will then tell OpenShift Lightspeed to use.

Building the RAG database is pretty trivial, as we have provided a pre-packaged container image with the required software.

If you have an extreme amount of internal documentation, the database creation process could take several minutes, as it involves a computationally-intensive process called embedding. GPUs can improve this time, but you are unlikely to be using this process very often (unless your internal documentation is changing very frequently).

Make sure to change your directory as follows:

cd ~/lb1344-showroom-openshift-lightspeed-and-rag-pipeline

You will want to create an output folder to hold the tarball/image created by this build process:

mkdir ~/lb1344-showroom-openshift-lightspeed-and-rag-pipeline/output

Then, run the following podman command to build the database:

podman run -e OUT_IMAGE_TAG=latest -it --rm --device=/dev/fuse \
-v ~/lb1344-showroom-openshift-lightspeed-and-rag-pipeline/byok:/markdown:Z \
-v ~/lb1344-showroom-openshift-lightspeed-and-rag-pipeline/output:/output:Z \
quay.io/thoraxe/byok-tool:250422

If that worked successfully, you should see something like:

...
Successfully tagged localhost/latest:latest
ea524f61b933d3fff369e5c55b27d9f845a378332e77a3fb0254c915f483b8da
Getting image source signatures
Copying blob 0ee7f3003527 done   |
Copying blob 1a4bd3506461 done   |
Copying blob 6554fa50686e done   |
Copying config ea524f61b9 done   |
Writing manifest to image destination

This will produce a tarball in the output folder that you created previously. Next you can load that tarball (to the local podman image store) and then push it into your OpenShift registry.

Load the image:

podman load -i ~/lb1344-showroom-openshift-lightspeed-and-rag-pipeline/output/latest.tar

Tag the Image

Next, tag the image appropriately so that it can be pushed to the registry:

podman tag localhost/latest:latest $(oc get route default-route \
-n openshift-image-registry \
--template='{{ .spec.host }}')/openshift-lightspeed/acme-byok:latest

To check if that completed successfully, list the images with:

podman images

And you should see your re-tagged image present.

Now that you have tagged the image, you will need to push it into the internal OpenShift Container Platform registry.

Push the Image

Now, login to the container registry with the following command:

podman login -u admin -p $(oc whoami -t) $(oc get route default-route \
-n openshift-image-registry --template='{{ .spec.host }}')

Lastly, push the image you just built:

podman push $(oc get route default-route \
-n openshift-image-registry \
--template='{{ .spec.host }}')/openshift-lightspeed/acme-byok:latest

Modify the OpenShift Lightspeed Configuration

Now that the image is available in an accessible container registry, you have to tell the OpenShift Lightspeed Operator to deploy the new RAG database alongside the existing one.

In the OpenShift Console, click Operators and then Installed Operators in the left hand navigation. Then, make sure to adjust the project dropdown to "All Projects" at the top of the screen.

Next, click the OpenShift Lightspeed operator in the list.

Next, click the OLSConfig tab, and then click the single OLSConfig instance called cluster in the list.

Finally, select the YAML tab.

In the YAML editor, you will want to insert the following yaml segment just before the status block. Your full YAML should look something like:

...
  ols:
...
    queryFilters:
      - name: ip-address
        pattern: '((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}'
        replaceWith: <IP-ADDRESS>
    rag:
      - image: 'image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/acme-byok:latest'
        indexID: vector_db_index
        indexPath: /rag/vector_db
status:
  conditions:
  ...

If you’re very familiar with YAML/JSON, the rag stanza is at .spec.ols.rag. The indentation is very important. The rag keyword should be at the same indentation as queryFilters.

    rag:
      - image: 'image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/acme-byok:latest'
        indexID: vector_db_index
        indexPath: /rag/vector_db

You can define multiple RAG databases this way, if you want to add multiple sources. The rag stanza just takes an array of image references.

Click the blue Save button.

Wait for Lightspeed

Click the Workloads navigation item on the left, then Pods. Next, find the openshift-lightspeed project in the dropdown (you will have to toggle the switch Show default namespaces).

Wait for the lightspeed-app-server-… API server pod to start and for both containers to be ready, and for the previous deployment’s containers to terminate and disappear.

Next, let’s test that it worked!