diff --git a/doc/development/ai_architecture.md b/doc/development/ai_architecture.md
index db6ad89469f83d1b4142a8d047d72340a9d89706..974d5bf30c353d5e5270505b0e846e1ee8e8296c 100644
--- a/doc/development/ai_architecture.md
+++ b/doc/development/ai_architecture.md
@@ -28,7 +28,7 @@ There are two primary reasons for this: the best AI models are cloud-based as th
 The AI Gateway (formerly the [model gateway](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist)) is a standalone service that gives all users of GitLab access to AI features, no matter which instance they are using: self-managed, Dedicated, or GitLab.com. The SaaS-based AI abstraction layer will transition to connecting to this gateway, rather than accessing cloud-based providers directly.
 
 Calls to the AI Gateway from GitLab Rails can be made using the
-[Abstraction Layer](ai_features/index.md#abstraction-layer).
+[Abstraction Layer](ai_features/index.md#feature-development-abstraction-layer).
 By default, these actions are performed asynchronously via a Sidekiq job to prevent long-running requests in Puma. Because of the latency Sidekiq adds, this approach should only be used for actions that are not latency sensitive.
diff --git a/doc/development/ai_features/duo_chat.md b/doc/development/ai_features/duo_chat.md
index 98e5be88d2d60e00fb4771f67704199285708145..21c7dc31792a875f1d0f0b8ec1964f3b2902aaa6 100644
--- a/doc/development/ai_features/duo_chat.md
+++ b/doc/development/ai_features/duo_chat.md
@@ -21,8 +21,8 @@ Rails backend sends then instructions to the Large Language Model (LLM) via the
 There is a difference in the setup for SaaS and self-managed instances. We recommend starting with the process described for SaaS-only AI features.
 
-1. [Setup SaaS-only AI features](index.md#test-saas-only-ai-features-locally).
-1. [Setup self-managed AI features](index.md#test-ai-features-with-ai-gateway-locally).
+1. [Set up SaaS-only AI features](index.md#saas-only-features).
+1. [Set up self-managed AI features](index.md#local-setup).
 
 ## Working with GitLab Duo Chat
 
@@ -40,12 +40,12 @@ If you find an undocumented issue, you should document it in this section after
 | Problem | Solution |
 | ----------------------------------------------------- | -------- |
 | There is no Chat button in the GitLab UI. | Make sure your user is part of a group with Experimental and Beta features enabled. |
-| Chat replies with "Forbidden by auth provider" error. | Backend can't access LLMs. Make sure your [AI Gateway](index.md#test-ai-features-with-ai-gateway-locally) is setup correctly. |
+| Chat replies with "Forbidden by auth provider" error. | Backend can't access LLMs. Make sure your [AI Gateway](index.md#local-setup) is set up correctly. |
 | Requests take too long to appear in the UI | Consider restarting Sidekiq by running `gdk restart rails-background-jobs`. If that doesn't work, try `gdk kill` and then `gdk start`. Alternatively, you can bypass Sidekiq entirely. To do that, temporarily alter `Llm::CompletionWorker.perform_async` statements to `Llm::CompletionWorker.perform_inline` |
 
 ## Contributing to GitLab Duo Chat
 
-From the code perspective, Chat is implemented in the similar fashion as other AI features. Read more about GitLab [AI Abstraction layer](index.md#abstraction-layer).
+From the code perspective, Chat is implemented in a similar fashion to other AI features. Read more about the GitLab [AI Abstraction Layer](index.md#feature-development-abstraction-layer).
 
 The Chat feature uses a [zero-shot agent](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/gitlab/llm/chain/agents/zero_shot/executor.rb) that includes a system prompt explaining how the large language model should interpret the question and provide an answer.
 The system prompt defines available tools that can be used to gather
@@ -181,7 +181,7 @@ REAL_AI_REQUEST=1 bundle exec rspec ee/spec/lib/gitlab/llm/completions/chat_real
 ```
 
 When you update the test questions that require documentation embeddings,
-make sure you [generate a new fixture](index.md#use-embeddings-in-specs) and
+make sure you [generate a new fixture](index.md#using-in-specs) and
 commit it together with the change.
 
 ## Testing with CI
diff --git a/doc/development/ai_features/index.md b/doc/development/ai_features/index.md
index 0a99b7553cdf5286fe8d3e2d9be1cd4c5e83cd96..ce4705cc36359b3223e1948c92fe95211591b1ea 100644
--- a/doc/development/ai_features/index.md
+++ b/doc/development/ai_features/index.md
@@ -8,64 +8,11 @@ info: Any user with at least the Maintainer role can merge updates to this conte
 [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/117296) in GitLab 15.11.
 
-## Feature flags
+## Get started
 
-Apply the following feature flags to any AI feature work:
-
-- A general flag (`ai_duo_chat_switch`) that applies to all GitLab Duo Chat features. It's enabled by default.
-- A general flag (`ai_global_switch`) that applies to all other AI features. It's enabled by default.
-- A flag specific to that feature. The feature flag name [must be different](../feature_flags/index.md#feature-flags-for-licensed-features) than the licensed feature name.
-
-See the [feature flag tracker epic](https://gitlab.com/groups/gitlab-org/-/epics/10524) for the list of all feature flags and how to use them.
-
-## Implement a new AI action
-
-To implement a new AI action, connect to the preferred AI provider. You can connect to this API using either the:
-
-- Experimental REST API.
-- Abstraction layer.
-
-All AI features are experimental.
-
-## Test self-managed AI features locally
-
-Skip to [AI Gateway Setup](#test-ai-features-with-ai-gateway-locally)
-
-## Test SaaS-only AI features locally
-
-**Automated setup**
-
-Replace`<test-group-name>` with the group name you want to enable GitLab Duo features.
-If the group doesn't exist, it creates a new one.
-You might need to re-run the script multiple times,
-it will print useful error messages with links to the docs on how to resolve the error.
-
-```shell
-GITLAB_SIMULATE_SAAS=1 RAILS_ENV=development bundle exec rake 'gitlab:duo:setup[<test-group-name>]'
-```
-
-**Manual way**
-
-1. Ensure you have followed [the process to obtain an EE license](https://handbook.gitlab.com/handbook/developer-onboarding/#working-on-gitlab-ee-developer-licenses) for your local instance and you applied Ultimate license.
-   1. To verify that the license is applied go to **Admin Area** > **Subscription** and check the subscription plan.
-1. Allow use of EE features for your instance.
-   1. Go to **Admin Area** > **Settings** > **General** -> **Account and limit**
-   1. Enable **Allow use of licensed EE features**
-1. Simulate the GDK to [simulate SaaS](../ee_features.md#simulate-a-saas-instance).
-1. Ensure the group you want to test has an Ultimate license.
-   1. Go to **Admin Area** > **Overview** > **Groups**
-   1. Select **Edit** for your chosen group.
-   1. Go to **Permissions and group features**
-   1. Choose *Ultimate* from the **Plan** list.
-1. Enable `Experiment & Beta features` for your group.
-   1. Go to the group with the Ultimate license
-   1. **Group Settings** > **General** -> **Permissions and group features**
-   1. Enable **Experiment & Beta features**
-1. Enable the specific feature flag for the feature you want to test
-1. You can use Rake task `rake gitlab:duo:enable_feature_flags` to enable all feature flags that are assigned to group AI Framework
-1. Setup [AI Gateway](#test-ai-features-with-ai-gateway-locally)
+### Access
 
-### Configure GCP Vertex access
+#### GCP Vertex
 
 In order to obtain a GCP service key for local development, follow the steps below:
@@ -86,107 +33,21 @@ In order to obtain a GCP service key for local development, follow the steps bel
 Gitlab::CurrentSettings.update(vertex_ai_project: PROJECT_ID)
 ```
 
-### Configure Anthropic access
-
-```ruby
-Gitlab::CurrentSettings.update!(anthropic_api_key: <insert API key>)
-```
-
-### Embeddings database
-
-Embeddings are generated through the [VertexAI text embeddings API](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings).
-
-Embeddings for GitLab documentation are updated based on the latest changes
-Monday through Friday at 05:00 UTC when the
-[embeddings cron job](https://gitlab.com/gitlab-org/gitlab/-/blob/6742f6bd3970c56a9d5bcd31e3d3dff180c97088/config/initializers/1_settings.rb#L817) runs.
-
-The sections below explain how to populate embeddings in the DB or extract
-embeddings to be used in specs.
-
-#### Set up
+#### Anthropic
 
-1. Enable [`pgvector`](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/pgvector.md#enable-pgvector-in-the-gdk) in GDK
-1. Enable the embedding database in GDK
-
-   ```shell
-   gdk config set gitlab.rails.databases.embedding.enabled true
-   ```
-
-1. Run `gdk reconfigure`
-1. Run database migrations to create the embedding database in the `gitlab` folder of the GDK
-
-   ```shell
-   RAILS_ENV=development bin/rails db:migrate
-   ```
-
-#### Populate
-
-Seed your development database with the embeddings for GitLab Documentation
-using this Rake task:
-
-```shell
-RAILS_ENV=development bundle exec rake gitlab:llm:embeddings:vertex:seed
-```
-
-This Rake Task populates the embeddings database with a vectorized
-representation of all GitLab Documentation. The file the Rake Task uses as a
-source is a snapshot of GitLab Documentation at some point in the past and is
-not updated regularly. As a result, it is helpful to know that this seed task
-creates embeddings based on GitLab Documentation that is out of date. Slightly
-outdated documentation embeddings are sufficient for the development
-environment, which is the use-case for the seed task.
-
-When writing or updating tests related to embeddings, you may want to update the
-embeddings fixture file:
-
-```shell
-RAILS_ENV=development bundle exec rake gitlab:llm:embeddings:vertex:extract_embeddings
-```
-
-#### Use embeddings in specs
-
-The `seed` Rake Task populates the development database with embeddings for all GitLab
-Documentation. The `extract_embeddings` Rake Task populates a fixture file with a subset
-of embeddings.
-
-The set of questions listed in the Rake Task itself determines
-which embeddings are pulled into the fixture file. For example, one of the
-questions is "How can I reset my password?" The `extract_embeddings` Task
-pulls the most relevant embeddings for this question from the development
-database (which has data from the `seed` Rake Task) and saves those embeddings
-in `ee/spec/fixtures/vertex_embeddings`. This fixture is used in tests related
-to embeddings.
-
-If you would like to change any of the questions supported in embeddings specs,
-update and re-run the `extract_embeddings` Rake Task.
-
-In the specs where you need to use the embeddings,
-use the RSpec `:ai_embedding_fixtures` metadata.
+[After filling out an access request](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/new?issuable_template=AI_Access_Request), you can sign up for an Anthropic account and create an API key. You can then configure the key:
 
 ```ruby
-context 'when asking about how to use GitLab', :ai_embedding_fixtures do
-  # ...examples
-end
+Gitlab::CurrentSettings.update!(anthropic_api_key: <insert API key>)
 ```
 
-### Tips for local development
-
-1. When responses are taking too long to appear in the user interface, consider restarting Sidekiq by running `gdk restart rails-background-jobs`. If that doesn't work, try `gdk kill` and then `gdk start`.
-1. Alternatively, bypass Sidekiq entirely and run the chat service synchronously. This can help with debugging errors as GraphQL errors are now available in the network inspector instead of the Sidekiq logs. To do that temporary alter `Llm::CompletionWorker.perform_async` statements with `Llm::CompletionWorker.perform_inline`
-
-### Working with GitLab Duo Chat
-
-View [guidelines](duo_chat.md) for working with GitLab Duo Chat.
-
-## Test AI features with AI Gateway locally
+### Local setup
 
 > - [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/11251) in GitLab 16.8.
 
 In order to develop an AI feature that is compatible with both SaaS and self-managed GitLab instances, the feature must send requests to the [AI Gateway](../../architecture/blueprints/ai_gateway/index.md) instead of requesting the third-party model providers directly.
 
-### Setup
-
 1. Set up CustomersDot (optional, not required for the Chat feature):
    1. Install CustomersDot: [internal video tutorial](https://youtu.be/_8wOMa_yGSw) - This video loosely follows [official installation steps](https://gitlab.com/gitlab-org/customers-gitlab-com/-/blob/main/doc/setup/installation_steps.md)
@@ -264,12 +125,48 @@ Gitlab::Llm::AiGateway::Client.new(User.first).stream(prompt: "\n\nHuman: Hi, ho
 If you can't fetch the response, check `graphql_json.log`, `sidekiq_json.log`, `llm.log`, or `modelgateway_debug.log` to see if they contain error information.
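The `Gitlab::Llm::AiGateway::Client.new(User.first).stream(prompt: ...)` call shown above follows a block-yielding streaming pattern. As a purely illustrative sketch — `FakeStreamingClient` and its canned chunks are invented here and are not GitLab code — the calling convention looks like this:

```ruby
# Hypothetical stand-in for a streaming AI Gateway client (not GitLab code).
# The real Gitlab::Llm::AiGateway::Client performs HTTP streaming; this class
# only models the block-yielding interface so the calling shape is clear.
class FakeStreamingClient
  def initialize(chunks)
    @chunks = chunks # pretend these arrive over the wire one at a time
  end

  # Yields each chunk as it "arrives" and returns the fully assembled response.
  def stream(prompt:)
    response = +""
    @chunks.each do |chunk|
      response << chunk
      yield chunk if block_given?
    end
    response
  end
end

client = FakeStreamingClient.new(["Hello", ", ", "how can I help?"])
received = []
full = client.stream(prompt: "\n\nHuman: Hi\n\nAssistant:") { |chunk| received << chunk }

puts received.length # => 3 (chunks were yielded as they streamed in)
puts full            # => "Hello, how can I help?"
```

Only the block-based consumption shape is modeled here; the real client's transport, authentication, and prompt format are out of scope.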
-### Temporary workaround to avoid AI Gateway setup
+### SaaS-only features
+
+These features do not use the AI Gateway and instead reach out to the LLM provider directly, because they are not yet following the [architecture blueprint](../../architecture/blueprints/ai_gateway/index.md). [We are planning](https://gitlab.com/groups/gitlab-org/-/epics/13024) to move these features to our self-managed offering, so any features developed under this setup will be migrated over time.
+
+**Automated setup**
+
+Replace `<test-group-name>` with the name of the group for which you want to enable GitLab Duo features.
+If the group doesn't exist, the script creates a new one.
+You might need to re-run the script multiple times;
+it prints useful error messages with links to the docs on how to resolve the error.
+
+```shell
+GITLAB_SIMULATE_SAAS=1 RAILS_ENV=development bundle exec rake 'gitlab:duo:setup[<test-group-name>]'
+```
+
+**Manual way**
+
+1. Ensure you have followed [the process to obtain an EE license](https://handbook.gitlab.com/handbook/developer-onboarding/#working-on-gitlab-ee-developer-licenses) for your local instance and have applied an Ultimate license.
+   1. To verify that the license is applied, go to **Admin Area** > **Subscription** and check the subscription plan.
+1. Allow use of EE features for your instance.
+   1. Go to **Admin Area** > **Settings** > **General** > **Account and limit**.
+   1. Enable **Allow use of licensed EE features**.
+1. Configure the GDK to [simulate SaaS](../ee_features.md#simulate-a-saas-instance).
+1. Ensure the group you want to test has an Ultimate license.
+   1. Go to **Admin Area** > **Overview** > **Groups**.
+   1. Select **Edit** for your chosen group.
+   1. Go to **Permissions and group features**.
+   1. Choose *Ultimate* from the **Plan** list.
+1. Enable **Experiment & Beta features** for your group.
+   1. Go to the group with the Ultimate license.
+   1. Go to **Group Settings** > **General** > **Permissions and group features**.
+   1. Enable **Experiment & Beta features**.
+1. Enable the specific feature flag for the feature you want to test.
+1. You can use the Rake task `rake gitlab:duo:enable_feature_flags` to enable all feature flags that are assigned to group AI Framework.
+1. Set up the [AI Gateway](#local-setup).
+
+### Bypass AI Gateway
 
 NOTE:
 You need to set up the AI Gateway since GitLab 16.8. It's the recommended way to test AI features. Sending requests directly to LLMs could lead to unnoticed bugs.
-Use this workaround with caution.
+Use this workaround with caution.
 
 To set up direct requests to LLMs you have to:
@@ -279,11 +176,23 @@ To setup direct requests to LLMs you have to:
    echo "Feature.disable(:gitlab_duo_chat_requests_to_ai_gateway)" | rails c
    ```
 
-1. Set the required access token. To receive an access token:
-   1. For Vertex, follow the [instructions below](#configure-gcp-vertex-access).
-   1. For Anthropic, create [an access request](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/new).
+### Help
+
+- [Here's how to reach us!](https://handbook.gitlab.com/handbook/engineering/development/data-science/ai-powered/ai-framework/#-how-to-reach-us)
 
-## Experimental REST API
+## Feature development (Abstraction Layer)
+
+### Feature flags
+
+Apply the following feature flags to any AI feature work:
+
+- A general flag (`ai_duo_chat_switch`) that applies to all GitLab Duo Chat features. It's enabled by default.
+- A general flag (`ai_global_switch`) that applies to all other AI features. It's enabled by default.
+- A flag specific to that feature. The feature flag name [must be different](../feature_flags/index.md#feature-flags-for-licensed-features) than the licensed feature name.
+
+See the [feature flag tracker epic](https://gitlab.com/groups/gitlab-org/-/epics/10524) for the list of all feature flags and how to use them.
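A minimal sketch of how the layered flags described above combine. The flag name `ai_global_switch` is real, but this `Flags` class and the `summarize_comments` flag are invented stand-ins for GitLab's `Feature` module, used only to show that a non-Chat AI feature is on when both the general switch and its own flag are enabled:

```ruby
# Illustrative stand-in for GitLab's Feature module (not real GitLab code).
# It shows the layered check: a non-Chat AI feature is available only when
# the general `ai_global_switch` flag AND the feature-specific flag are on.
class Flags
  def initialize
    # The general switch is enabled by default, as described above.
    @enabled = { ai_global_switch: true }
  end

  def enable(name)
    @enabled[name] = true
  end

  def disable(name)
    @enabled[name] = false
  end

  def enabled?(name)
    @enabled.fetch(name, false)
  end

  def ai_feature_available?(feature_flag)
    enabled?(:ai_global_switch) && enabled?(feature_flag)
  end
end

flags = Flags.new
puts flags.ai_feature_available?(:summarize_comments) # => false: feature flag not enabled yet
flags.enable(:summarize_comments)
puts flags.ai_feature_available?(:summarize_comments) # => true: both flags enabled
flags.disable(:ai_global_switch)
puts flags.ai_feature_available?(:summarize_comments) # => false: general switch disabled
```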
+
+### Experimental REST API
 
 Use the [experimental REST API endpoints](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/api/ai/experimentation) to quickly experiment and prototype AI features.
@@ -303,8 +212,6 @@ Feature.enable(:ai_experimentation_api)
 On production, the experimental endpoints are only available to GitLab team members. Use a
 [GitLab API token](../../user/profile/personal_access_tokens.md) to authenticate.
 
-## Abstraction layer
-
 ### GraphQL API
 
 To connect to the AI provider API using the Abstraction Layer, use an extendable GraphQL API called
@@ -600,15 +507,99 @@ Gitlab::Llm::VertexAi::Client.new(user)
 Gitlab::Llm::Anthropic::Client.new(user)
 ```
 
-### Monitoring Ai Actions
+### Add AI Action to GraphQL
 
-- Error ratio and response latency apdex for each Ai action can be found on [Sidekiq Service dashboard](https://dashboards.gitlab.net/d/sidekiq-main/sidekiq-overview?orgId=1) under **SLI Detail: `llm_completion`**.
-- Spent tokens, usage of each Ai feature and other statistics can be found on [periscope dashboard](https://app.periscopedata.com/app/gitlab/1137231/Ai-Features).
+TODO
 
-### Add Ai Action to GraphQL
+## Embeddings database
 
-TODO
+Embeddings must be generated for the chat documentation tool to work. The documentation tool works on SaaS only at this point.
+
+Embeddings are generated through the [VertexAI text embeddings API](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings).
+
+Embeddings for GitLab documentation are updated based on the latest changes
+Monday through Friday at 05:00 UTC when the
+[embeddings cron job](https://gitlab.com/gitlab-org/gitlab/-/blob/6742f6bd3970c56a9d5bcd31e3d3dff180c97088/config/initializers/1_settings.rb#L817) runs.
+
+The sections below explain how to populate embeddings in the DB or extract
+embeddings to be used in specs.
+
+### Set up
+
+1. Enable [`pgvector`](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/pgvector.md#enable-pgvector-in-the-gdk) in GDK.
+1. Enable the embedding database in GDK:
+
+   ```shell
+   gdk config set gitlab.rails.databases.embedding.enabled true
+   ```
+
+1. Run `gdk reconfigure`.
+1. Run database migrations to create the embedding database in the `gitlab` folder of the GDK:
+
+   ```shell
+   RAILS_ENV=development bin/rails db:migrate
+   ```
+
+### Populate
+
+Seed your development database with the embeddings for GitLab Documentation
+using this Rake task:
+
+```shell
+RAILS_ENV=development bundle exec rake gitlab:llm:embeddings:vertex:seed
+```
+
+This Rake task populates the embeddings database with a vectorized
+representation of all GitLab Documentation. The file the Rake task uses as a
+source is a snapshot of GitLab Documentation at some point in the past and is
+not updated regularly. As a result, it is helpful to know that this seed task
+creates embeddings based on GitLab Documentation that is out of date. Slightly
+outdated documentation embeddings are sufficient for the development
+environment, which is the use case for the seed task.
+
+When writing or updating tests related to embeddings, you may want to update the
+embeddings fixture file:
+
+```shell
+RAILS_ENV=development bundle exec rake gitlab:llm:embeddings:vertex:extract_embeddings
+```
+
+### Using in specs
+
+The `seed` Rake task populates the development database with embeddings for all GitLab
+Documentation. The `extract_embeddings` Rake task populates a fixture file with a subset
+of embeddings.
+
+The set of questions listed in the Rake task itself determines
+which embeddings are pulled into the fixture file. For example, one of the
+questions is "How can I reset my password?" The `extract_embeddings` task
+pulls the most relevant embeddings for this question from the development
+database (which has data from the `seed` Rake task) and saves those embeddings
+in `ee/spec/fixtures/vertex_embeddings`. This fixture is used in tests related
+to embeddings.
+
+If you would like to change any of the questions supported in embeddings specs,
+update and re-run the `extract_embeddings` Rake task.
+
+In the specs where you need to use the embeddings,
+use the RSpec `:ai_embedding_fixtures` metadata.
+
+```ruby
+context 'when asking about how to use GitLab', :ai_embedding_fixtures do
+  # ...examples
+end
+```
+
+## Monitoring
+
+- Error ratio and response latency apdex for each AI action can be found on the [Sidekiq Service dashboard](https://dashboards.gitlab.net/d/sidekiq-main/sidekiq-overview?orgId=1) under **SLI Detail: `llm_completion`**.
+- Spent tokens, usage of each AI feature, and other statistics can be found on the [periscope dashboard](https://app.periscopedata.com/app/gitlab/1137231/Ai-Features).
 
 ## Security
 
 Refer to the [secure coding guidelines for Artificial Intelligence (AI) features](../secure_coding_guidelines.md#artificial-intelligence-ai-features).
+
+## Tips for local development
+
+1. When responses are taking too long to appear in the user interface, consider restarting Sidekiq by running `gdk restart rails-background-jobs`. If that doesn't work, try `gdk kill` and then `gdk start`.
+1. Alternatively, bypass Sidekiq entirely and run the service synchronously. This can help with debugging errors, as GraphQL errors are then available in the network inspector instead of the Sidekiq logs. To do that, temporarily alter the `perform_for` method in the `Llm::CompletionWorker` class by changing `perform_async` to `perform_inline`.
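The bypass-Sidekiq tip in the local-development notes (swapping `perform_async` for `perform_inline`) can be sketched with a stand-in worker. This `CompletionWorker` is a simplified, hypothetical model — not the real `Llm::CompletionWorker` — showing why the inline variant surfaces errors directly in the calling process:

```ruby
# Made-up stand-in for a Sidekiq worker (not GitLab code). perform_async would
# normally enqueue a background job for a separate Sidekiq process, while
# perform_inline runs the same work synchronously in the calling process —
# which is why swapping one for the other makes failures raise immediately
# instead of being buried in the Sidekiq logs.
class CompletionWorker
  QUEUE = []

  def self.perform_async(*args)
    QUEUE << args # enqueued; a background worker would pick this up later
    nil
  end

  def self.perform_inline(*args)
    new.perform(*args) # executed right here; any error raises in this process
  end

  def perform(user_id, message)
    "completion for user #{user_id}: #{message.upcase}"
  end
end

CompletionWorker.perform_async(1, "hi")
puts CompletionWorker::QUEUE.size             # => 1 (job waiting for a worker)
puts CompletionWorker.perform_inline(1, "hi") # => completion for user 1: HI
```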
diff --git a/doc/development/cloud_connector/index.md b/doc/development/cloud_connector/index.md
index c705e106e9b2b81b23ed5d00ccf354d3ca2775cb..097c9387d49639223616f035114b4d929ebfff0a 100644
--- a/doc/development/cloud_connector/index.md
+++ b/doc/development/cloud_connector/index.md
@@ -255,4 +255,4 @@ and assign it to the Cloud Connector group.
 
 ## Testing
 
-An example for how to set up an end-to-end integration with the AI gateway as the backend service can be found [here](../ai_features/index.md#setup).
+An example of how to set up an end-to-end integration with the AI Gateway as the backend service can be found [here](../ai_features/index.md#local-setup).