Working with Collections

The Collections section within the Data Catalog allows you to define and manage specific subsets of data within your registered Vector Databases. Collections are essential for organizing your vector data and making it accessible to other Kosmoy Studio components, such as Data Channels and Vector Pipelines.

Prerequisites:

Before creating a Collection, you must have a Vector Database registered in the Data Catalog. Refer to the Databases section for details.

Creating a Collection

  1. Navigate to Collections: From the Kosmoy Studio home page, click the “Data Management” menu in the left-hand navigation bar, select “Data Catalog”, and then click “Collections”.

  2. Add a New Collection: Click the “+ ADD” button located in the upper right corner of the Collections section.

  3. Select Vector Database: From the dropdown menu, select the pre-configured Vector Database in which you want to create the collection.

  4. Define Distance Strategy: Choose the appropriate distance strategy for your collection. The available options are:

    • Euclidean: Measures the straight-line distance between two vectors.
    • Cosine: Measures the cosine of the angle between two vectors, indicating their directional similarity.
    • Max Inner Product: Calculates the inner (dot) product of two vectors; a larger value indicates greater similarity.

    Note: The right distance strategy depends on the nature of your data and the specific requirements of your application. It should match the strategy you configured when creating your Vector Database.

  5. Configure Collection Connection Parameters: Enter the required connection parameters for your specific Vector Database type. These parameters may vary depending on the database provider.

  6. Click “Next”: Proceed to the next step.

  7. Name and Describe the Collection: Give your collection a unique name and an optional description.

  8. Click “Review”: Review the collection configuration.

  9. Click “Create”: Create the collection.
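
To make the distance strategies from step 4 more concrete, the following is a minimal Python sketch of how each measure is computed for two small vectors. It is illustrative only: the function names and sample vectors are assumptions made for this example, and it does not show how Kosmoy Studio or any particular Vector Database implements these strategies.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two vectors (smaller means closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner (dot) product of two vectors (larger means more similar)."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors; ignores their lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


# Hypothetical sample vectors, chosen only for illustration.
query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # points the same way as query, but is longer
orthogonal = [0.0, 1.0]      # at a right angle to query

for label, vector in [("same direction", same_direction), ("orthogonal", orthogonal)]:
    print(
        label,
        euclidean(query, vector),
        cosine_similarity(query, vector),
        inner_product(query, vector),
    )
```

Note that the vector pointing in the same direction as the query has a cosine similarity of 1 even though its Euclidean distance from the query is not zero. Cosine ignores vector length, while Euclidean and Max Inner Product are both sensitive to it.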

Collection Cards

The Collections section displays each created collection as a card. Each card shows:

  • Collection Icon: An icon representing a collection.
  • Collection Name: The name you assigned to the collection.
  • Description: The description you provided for the collection.
  • Edit Icon (Pencil): Click this icon to update the collection’s name or description (only if the collection is not in use).
  • Delete Icon (Trash Bin): Click this icon to remove the collection (only if the collection is not in use).

Collection Usage Restrictions

You cannot edit or delete a collection if it is currently referenced by other entities within Kosmoy Studio. This includes being used in:

  • Data Channels
  • Vector Pipelines
  • Any other Kosmoy Studio component that references collections.

Before attempting to edit or delete a collection, ensure it is not actively used in any of these areas.
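
The restriction above can be pictured as a simple guard: a collection may only be removed when nothing references it. The sketch below models that rule with an in-memory dictionary. Every name in it (Collection, referenced_by, CollectionInUseError, delete_collection) is hypothetical and invented for this example; it is not the Kosmoy Studio API, which this guide describes only through the user interface.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Collection:
    # Hypothetical model of a collection card, for illustration only.
    name: str
    description: str = ""
    # Names of components (for example Data Channels or Vector Pipelines)
    # that currently reference this collection.
    referenced_by: list[str] = field(default_factory=list)


class CollectionInUseError(Exception):
    """Raised when a referenced collection would be deleted."""


def delete_collection(catalog: dict[str, Collection], name: str) -> None:
    """Remove a collection from the catalog only if nothing references it."""
    collection = catalog[name]
    if collection.referenced_by:
        users = ", ".join(collection.referenced_by)
        raise CollectionInUseError(f"'{name}' is in use by: {users}")
    del catalog[name]
```

In this model, removing the references first (for example, by pointing a Data Channel at a different collection) is what makes the deletion possible, which mirrors the check recommended above.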

Updating a Collection

You can update the name and description of a registered collection, provided it is not currently referenced by any other component.

  1. Navigate to Collections: Go to “Data Management” > “Data Catalog” > “Collections”.
  2. Locate the Collection Card: Find the card for the collection you want to update.
  3. Click the Edit (Pencil) Icon: This will open the update dialog.
  4. Modify Name and/or Description: Update the collection’s name and/or description as needed.
  5. Click “Save”: Save the changes.

Note: You cannot modify the core configuration of a collection (database, distance strategy, connection parameters) after it has been created.

Removing a Collection

You can remove a registered collection if it’s no longer needed. However, you cannot delete a collection that is currently referenced by any other component.

  1. Navigate to Collections: Go to “Data Management” > “Data Catalog” > “Collections”.
  2. Locate the Collection Card: Find the card for the collection you want to remove.
  3. Click the Delete (Trash Bin) Icon: This will trigger a confirmation prompt.
  4. Confirm Deletion: Confirm that you want to delete the collection.

If you attempt to delete a collection that is currently in use, a modal will appear, preventing the deletion and explaining that the collection is in use.

Warning: Deleting a collection is a permanent action and cannot be undone. Ensure that the collection is not being referenced by any other component before proceeding.

Using Collections

Collections are primarily used in two ways within Kosmoy Studio:

  • Data Channels: When creating a Vector Data Channel, you need to select a specific collection as the data source. This allows your AI Assistants to access and use the vector data within that collection.
  • Vector Pipelines: When configuring a Vector Pipeline, you need to specify a target collection where the generated vector embeddings will be stored.

This section provides a detailed guide to creating and managing Collections within Kosmoy Studio’s Data Catalog.