2.1 Lab Overview & Setup

Now it’s your turn to get hands-on. In this lab, you will deploy the mock ServiceNow API and the Kubeflow Pipeline that ingests its data into our Milvus vector database.

To ensure you can get started quickly, we have pre-created a project for you and deployed some of the necessary components. This section will guide you through connecting to the environment, exploring your project, and running a simple test against the Large Language Model (LLM) provided for this workshop.

Getting Connected

For this workshop, we have provisioned a shared OpenShift Container Platform cluster with Red Hat OpenShift AI deployed on it. Each attendee has a unique user account.

Environment Information

If you are viewing these instructions in the deployed lab environment, the values below will be correctly rendered for you. If viewing from a static source like GitHub, placeholder values will appear.

Table 1. Lab Environment Access
Username	`userX`
Password	`openshift`

Login Procedure

On the OpenShift AI Dashboard login page, click the Login with OpenShift button.
Enter the credentials provided above (userX and openshift).

Your browser might display a security warning. It is safe to ignore this message for the lab environment.
After you authenticate, you will land on the OpenShift AI dashboard.

Congratulations, you are now connected!

Reviewing Your Pre-Created Project

A project has been pre-created for you. Let’s take a quick tour.

In the Red Hat OpenShift AI Dashboard, navigate to the Projects area and select the default Workspace that’s been created.

Inside your project, you will find a pre-configured Jupyter Workbench for model testing. Later, you’ll set up a Pipeline and store data in the Cluster storage.

Your dedicated user project home showing available components

Table 2. Project Components Overview
Component	Purpose
Workbench	Jupyter environment for model testing and development
Pipeline	Data ingestion and processing workflows (to be configured)
Cluster Storage	Persistent storage for data and models

These resources all run as pods on the cluster. To see the running pods, switch to the main OpenShift Console: on the left-hand side of the page, select the OpenShift Console tab.

The console may prompt you to sign in again. Use your credentials: userX and openshift.
First, make sure you have the userX project selected from the project dropdown at the top of the console. Then, using the left-hand menu, navigate to Workloads → Pods. Here you can see all the running components, like the Pipeline Server, database, and other pre-configured resources.

Interacting with a Large Language Model

Before we build our RAG pipeline, let’s connect to the local LLM that has been deployed for this workshop and ask it a basic question. This will help you get familiar with interacting with an LLM programmatically from your pre-created Jupyter environment via an OpenAI-compatible API.

The LLM service (granite-4.0-h-tiny) is already running in another OpenShift AI cluster (MaaS) and is accessible through an API endpoint with an API token.

Return to the OpenShift AI Dashboard browser tab and navigate to the Workbenches tab within your project. Launch the workbench (my-workbench) by clicking on its name.
Once JupyterLab opens, use the file browser on the left to navigate to the lab materials folder:
```
hello-chris-rag-pipeline/lab-content/3.1
```
Open the notebook named 03-01-nb-llm-example.ipynb. This notebook contains pre-written Python code to connect to and query the LLM.
Copy your API token into the openai_api_key variable, replacing the PASTE_YOUR_LITELLM_KEY_HERE string:
```
{litellm_virtual_key}
```

Execute the cells in the notebook one by one. Click the play button (▶️) next to each cell or use Shift+Enter to run them sequentially.

Executing notebook cells showing the play button and cell execution flow

The key cells perform the following actions:

Connect to the LLM: This cell defines the connection to the remote inference endpoint that serves the granite-4.0-h-tiny model.

```
from langchain_community.llms import VLLMOpenAI  # import path may vary by LangChain version

# LLM Inference Server URL
inference_server_url = "https://litellm-prod.apps.maas.redhatworkshops.io"  # (1)

# LLM definition using a client that speaks the OpenAI API format
llm = VLLMOpenAI(
    openai_api_key="{litellm_virtual_key}",  # (2)
    openai_api_base=f"{inference_server_url}/v1",  # (3)
    model_name="granite-4.0-h-tiny",  # (4)
    # ... other parameters
)
```

1	URL of the LLM inference server (an external route on the MaaS cluster, not an in-cluster service)
2	Your personal API token (`{litellm_virtual_key}`) injected at runtime
3	OpenAI-compatible API endpoint format
4	Specific model identifier for the deployed Granite model
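
To make the callouts above concrete, the sketch below builds (but does not send) the kind of HTTP request an OpenAI-compatible client issues. It uses only the Python standard library. The `/v1/completions` path, the helper name `build_completion_request`, the `max_tokens` value, and the dummy key `sk-dummy` are assumptions for illustration; the notebook itself relies on the LangChain client.

```python
import json
import urllib.request


def build_completion_request(base_url, api_key, model, prompt, max_tokens=128):
    """Build (without sending) a POST request for an OpenAI-compatible completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # the API token travels as a bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_completion_request(
    base_url="https://litellm-prod.apps.maas.redhatworkshops.io",
    api_key="sk-dummy",  # hypothetical placeholder, not a real key
    model="granite-4.0-h-tiny",
    prompt="What is OpenShift AI?",
)
print(request.get_method(), request.full_url)
```

Passing this object to `urllib.request.urlopen` would send it; with a valid token, the server is expected to reply with a JSON body containing the generated text.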

Define the Prompt: This cell creates a prompt template that instructs the LLM how to behave and formats the user’s question.
```
from langchain_core.prompts import PromptTemplate  # import path may vary by LangChain version

# Callouts (1) and (2) refer to the system line and the {input} placeholder inside the string
template = """<|system|>
You are a helpful, respectful and honest assistant.
<|user|>
### QUESTION:
{input}
### ANSWER:
<|assistant|>
"""
prompt = PromptTemplate(input_variables=["input"], template=template)  # (3)
```
1	System message defining the assistant’s behavior
2	Placeholder for user input that will be dynamically replaced
3	LangChain PromptTemplate object for structured prompt formatting
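
The substitution itself is ordinary string formatting. As a minimal sketch that needs no LangChain install, the snippet below fills the same template with Python's built-in `str.format`. The sample question is made up, and `PromptTemplate` is assumed to behave equivalently for this single-variable case.

```python
# Same template text as in the notebook cell, without callout markers
template = """<|system|>
You are a helpful, respectful and honest assistant.
<|user|>
### QUESTION:
{input}
### ANSWER:
<|assistant|>
"""

question = "What is a vector database?"  # hypothetical sample question
rendered = template.format(input=question)  # replaces the {input} placeholder
print(rendered)
```

The rendered string ends with the `<|assistant|>` tag, which cues the model to continue from there with its answer.
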
Ask a Question: The final cell defines a question and sends it to the LLM. You should see the answer streamed back directly in the notebook output.

Watch for the cell execution indicator [*] which shows when a cell is running, and [number] when it completes successfully.
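
The answer appears incrementally because OpenAI-compatible endpoints typically stream results as server-sent events: lines prefixed with `data:`, ending with a `data: [DONE]` sentinel. The sketch below parses such lines into text fragments. The sample events are hand-written for illustration, not captured from the workshop server, and the `choices[0].text` layout is assumed from the completions-style API; the LangChain client in the notebook handles this for you.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # sentinel marking the end of the stream
        event = json.loads(data)
        yield event["choices"][0].get("text", "")


# Hand-written sample events (assumed shape), standing in for a live response
sample_events = [
    'data: {"choices": [{"text": "OpenShift "}]}',
    "",
    'data: {"choices": [{"text": "AI"}]}',
    "",
    "data: [DONE]",
]

answer = "".join(iter_stream_text(sample_events))
print(answer)
```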

Summary

Explored the pre-configured Data Science Project: workbench, pipeline server, and cluster storage provisioned per user
Switched between the Red Hat OpenShift AI Dashboard (ML-focused) and the OpenShift Container Platform Console (infrastructure-focused) to view the same resources from different perspectives
Queried a Granite LLM through the OpenAI-compatible API served under /v1, the same interface the RAG chain will use later
