Introduction to Data Catalog in Kosmoy Studio
Learn how to use the Data Catalog to create a structured inventory of your data sources within Kosmoy Studio.
Working with the Data Catalog
The Data Catalog within Kosmoy Studio provides a central place to manage metadata about your data sources. It allows you to create a structured inventory of your databases, collections, object stores, and folders, so that you can discover, understand, and use your data assets within the Kosmoy platform.
Important Note: The Data Catalog does not store the actual data. Instead, it creates metadata representations that point to the actual location of your data sources. This means that your data remains securely stored in its original location, and Kosmoy Studio simply maintains references to it for organizational and access control purposes.
Key Components of the Data Catalog:
- Databases: Register your SQL or Vector databases, providing a structured view of their contents. Requires a pre-existing Connection.
- Collections: Define specific collections within your registered Vector databases, representing subsets of data relevant to your AI applications.
- Object Stores: Register your object storage services (such as Amazon S3, Azure Blob Storage, or Google Cloud Storage) to manage unstructured data. Requires a pre-existing Integration with a multi-service cloud provider.
- Folders: Define specific folders within your registered object stores, allowing for granular control over data access.
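To make the relationship between these four components concrete, the following is a minimal sketch of the hierarchy as plain Python dataclasses. It is not Kosmoy's API: every class, field, and method name here (`Database`, `ObjectStore`, `add_collection`, `add_folder`, and so on) is a hypothetical name chosen for illustration. The sketch assumes only what is described above: databases reference a Connection, object stores reference an Integration, collections live inside vector databases, folders live inside object stores, and every entry holds a reference to the data rather than the data itself.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Metadata for one collection inside a registered vector database."""
    name: str


@dataclass
class Folder:
    """Metadata for one folder inside a registered object store."""
    path: str  # a reference to the folder's location, not its contents


@dataclass
class Database:
    """Metadata for a registered SQL or vector database (hypothetical model)."""
    name: str
    kind: str        # "sql" or "vector"
    connection: str  # name of the pre-existing Connection it relies on
    collections: list[Collection] = field(default_factory=list)

    def add_collection(self, name: str) -> Collection:
        # Collections are defined within vector databases only.
        if self.kind != "vector":
            raise ValueError("collections can only be defined on vector databases")
        collection = Collection(name)
        self.collections.append(collection)
        return collection


@dataclass
class ObjectStore:
    """Metadata for a registered object storage service (hypothetical model)."""
    name: str
    integration: str  # name of the pre-existing Integration it relies on
    folders: list[Folder] = field(default_factory=list)

    def add_folder(self, path: str) -> Folder:
        folder = Folder(path)
        self.folders.append(folder)
        return folder
```

In this model, a folder or collection cannot exist without its parent entry, and the parent cannot exist without naming the Connection or Integration it depends on, which mirrors the registration order described in this section.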
Benefits of Using the Data Catalog:
- Centralized Data Discovery: Easily find and understand your available data sources within a single interface.
- Simplified Data Access: Streamline the process of connecting your AI applications to relevant data.
- Enhanced Collaboration: Provide a shared understanding of your data assets across your team.
- Improved Data Governance: Maintain a clear overview of your data sources and control access through Kosmoy’s role-based permissions.
Prerequisites:
Before adding data sources to the Data Catalog, you need to create the necessary Connections (for databases) or Integrations (for object stores) with your data providers. Refer to the Managing Connections and Managing Integrations sections for details.
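The dependency described above can be pictured as a simple check that runs before a data source is registered. The sketch below is illustrative only and does not call any Kosmoy API: `check_prerequisite`, `MissingPrerequisiteError`, and the `"database"` / `"object_store"` type labels are hypothetical names, and the sets of existing Connections and Integrations are assumed to be supplied by the caller.

```python
class MissingPrerequisiteError(Exception):
    """Raised when a data source names a Connection or Integration that does not exist."""


# Which prerequisite each top-level catalog entry type depends on.
PREREQUISITE_FOR = {
    "database": "connection",
    "object_store": "integration",
}


def check_prerequisite(entry_type, reference, connections, integrations):
    """Return the kind of prerequisite that was satisfied, or raise if it is missing.

    entry_type   -- "database" or "object_store" (hypothetical labels)
    reference    -- name of the Connection or Integration the entry relies on
    connections  -- set of existing Connection names
    integrations -- set of existing Integration names
    """
    required = PREREQUISITE_FOR.get(entry_type)
    if required is None:
        raise ValueError(f"unknown entry type: {entry_type!r}")

    available = connections if required == "connection" else integrations
    if reference not in available:
        raise MissingPrerequisiteError(
            f"{entry_type} requires an existing {required} named {reference!r}"
        )
    return required
```

A caller would run this check first and only create the catalog entry if it returns without raising, so that a database is never registered against a Connection that has not been created yet.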
In the following sections, you will learn how to:
- Register and manage Databases in the Data Catalog.
- Define and manage Collections within your Vector databases.
- Register and manage Object Stores.
- Define and manage specific Folders within your Object Stores.
Once your data sources are registered in the Data Catalog, your team can find them in a single inventory, control access to them through Kosmoy's role-based permissions, and connect them to the AI applications you build on the Kosmoy platform.