Introduction

Giskard LLM Evaluation Hub is a platform that centralizes the validation of LLM applications. It enables product teams to verify that all functional, business, and legal requirements are met while staying in close contact with the development team, so that validation does not delay deployment.

The Hub can be deployed on-premise or in the cloud, depending on your specific needs.

Throughout this user guide, we'll use a banking app called Zephyr Bank, designed by data scientists, as a running example. The app's chatbot provides customer service support on the bank's website, answering questions about the bank's products, services, and more.

The Dashboard

The Dashboard is the first page you'll see upon logging in. It provides an overview of your project, displaying the number of models, datasets, evaluations, and knowledge bases.

It also features a graph showing the model's performance over time, measured by two metrics: Conformity and Correctness. By default, the bar graph displays Conformity; clicking the Correctness block switches the view to Correctness data. These metrics are covered in more detail in the Evaluations section.

Additionally, the dashboard lists your most recent evaluations and datasets for quick access.

Create a project

In this section, you will learn how to create a project. Before creating one, make sure the model is properly configured (see the “Setup the model” section).

Click the “Account” icon in the upper right corner of the screen, then select “Settings”. The Settings page allows you to manage your projects and users (if you have the proper access rights).

In the Projects tab, click the “Create project” button. A modal will appear where you can enter your project's name and description.