Resource Hierarchy
Core Resources
Workspace
The top-level organizational container. Workspaces group templates, sessions, and team members together. Each workspace has its own set of users, each with a specific role (admin, user, or readonly).
One workspace can contain multiple templates, unlimited sessions, and multiple team members.
| Field | Description |
|---|---|
| id | Unique identifier |
| name | Display name |
| icon | Emoji or icon |
Extraction Template
Defines what data to extract. Templates contain the JSON schema that specifies which fields to extract from documents. Each template belongs to one workspace and can be used across many sessions.
| Field | Description |
|---|---|
| id | UUID identifier |
| name | Template name |
| description | Optional description |
| schema_json | Extraction schema definition |
| settings | Parsing configuration |
| workspace_id | Parent workspace |
Key relationships:
- Belongs to one Workspace
- Used by many Sessions
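To make the template fields concrete, here is a minimal Python sketch of a template record. The dataclass mirrors the field table above, but the layout inside `schema_json` (a `fields` list with `name`, `type`, `search_query`, and `prompt` keys) is an assumption for illustration, as are the example IDs and names; the actual schema format is not shown in this section.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class ExtractionTemplate:
    """Local stand-in that mirrors the field table; not an official client type."""
    name: str
    workspace_id: str                       # parent workspace
    schema_json: dict[str, Any]             # extraction schema definition
    description: Optional[str] = None       # optional description
    settings: dict[str, Any] = field(default_factory=dict)  # parsing configuration
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # UUID identifier


# Hypothetical schema layout: the key names below are assumptions.
invoice_schema = {
    "fields": [
        {
            "name": "invoice_number",
            "type": "string",
            "search_query": "invoice number",
            "prompt": "Return the invoice number exactly as printed.",
        },
        {
            "name": "total_amount",
            "type": "number",
            "search_query": "total amount due",
            "prompt": "Return the final total including tax.",
        },
    ]
}

template = ExtractionTemplate(
    name="Invoices",
    workspace_id="ws_123",  # hypothetical workspace ID
    schema_json=invoice_schema,
    description="Header fields from supplier invoices",
)

# Serialize the record, for example to inspect it or store it alongside a project.
payload = json.dumps(asdict(template), indent=2)
print(payload)
```

Because a template is reused across many sessions, keeping the schema in one place like this means a schema change applies to every later session that references the template.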
Extraction Session
An individual extraction job. Sessions are where actual extraction happens. You upload documents to a session, run the extraction, and retrieve results. Each session uses one template.
| Field | Description |
|---|---|
| id | UUID identifier |
| name | Session name |
| extraction_template_id | Template to use |
| status | pending, processing, completed, failed |
Key relationships:
- Uses one Template
- Contains many Documents
- Produces Results
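The four session status values suggest a simple lifecycle. The sketch below encodes them as an enum with a transition table. The status names come from the table above, but the allowed transitions (pending to processing, processing to completed or failed) are an assumption, since this section lists the states without saying how a session moves between them.

```python
from enum import Enum


class SessionStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions: the states are documented, the moves between them are not.
ALLOWED_TRANSITIONS = {
    SessionStatus.PENDING: {SessionStatus.PROCESSING},
    SessionStatus.PROCESSING: {SessionStatus.COMPLETED, SessionStatus.FAILED},
    SessionStatus.COMPLETED: set(),
    SessionStatus.FAILED: set(),
}


def can_transition(current: SessionStatus, new: SessionStatus) -> bool:
    """True if moving from `current` to `new` is allowed by the assumed table."""
    return new in ALLOWED_TRANSITIONS[current]


def is_terminal(status: SessionStatus) -> bool:
    """A status is terminal when no further transitions are defined for it."""
    return not ALLOWED_TRANSITIONS[status]


# Raw strings convert directly to enum members.
status = SessionStatus("processing")
print(can_transition(status, SessionStatus.COMPLETED))
print(is_terminal(status))
```

A client that polls a session could stop as soon as `is_terminal` returns true for the reported status.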
Document
A source file for extraction. Documents are PDFs, images, or other files uploaded to a session. After upload, documents are automatically parsed into chunks for extraction.
| Field | Description |
|---|---|
| id | UUID identifier |
| filename | Original filename |
| mime_type | File type |
| size | Size in bytes |
| status | pending, parsing, parsed, failed |
Key relationships:
- Belongs to one Session
- Contains many Chunks (after parsing)
Chunk
A parsed segment of a document. When documents are processed, they’re split into chunks: meaningful segments of text with page references. The AI uses these chunks to find and extract data.
| Field | Description |
|---|---|
| id | UUID identifier |
| content | Text content |
| page_number | Source page |
| chunk_index | Order within document |
| metadata | Additional parsing info |
Key relationships:
- Belongs to one Document
- Referenced in extraction Results
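Since each chunk carries both a `chunk_index` and a `page_number`, chunks that arrive in arbitrary order can be put back into reading order and grouped by page. The following is a minimal sketch using the fields from the table above; the sample chunk contents and IDs are made up, and the zero-based `chunk_index` is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    id: str
    content: str
    page_number: int
    chunk_index: int  # assumed zero-based; only the relative order matters here
    metadata: dict[str, Any] = field(default_factory=dict)


def in_document_order(chunks: list[Chunk]) -> list[Chunk]:
    """Sort chunks by their position within the document."""
    return sorted(chunks, key=lambda c: c.chunk_index)


def text_by_page(chunks: list[Chunk]) -> dict[int, str]:
    """Join chunk text per page, preserving document order within each page."""
    pages: dict[int, list[str]] = defaultdict(list)
    for chunk in in_document_order(chunks):
        pages[chunk.page_number].append(chunk.content)
    return {page: "\n".join(parts) for page, parts in pages.items()}


# Illustrative data, deliberately out of order.
chunks = [
    Chunk(id="c3", content="Total due: 120.00", page_number=2, chunk_index=2),
    Chunk(id="c1", content="Invoice INV-001", page_number=1, chunk_index=0),
    Chunk(id="c2", content="Billed to: Acme Ltd", page_number=1, chunk_index=1),
]

pages = text_by_page(chunks)
print(pages[1])
```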
Extraction Result
The extracted data output. Results contain the structured data extracted from session documents according to the template schema. Each result includes the extracted values and, optionally, AI reasoning traces.
| Field | Description |
|---|---|
| id | UUID identifier |
| status | pending, processing, completed, failed |
| data | Extracted values |
| reasoning | AI reasoning (if enabled) |
Key relationships:
- Belongs to one Session
- References source Documents/Chunks
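The relationships listed for each resource form a chain from a chunk up to its workspace. The sketch below walks that chain over a small in-memory store. The `workspace_id` and `extraction_template_id` keys appear in the field tables; `document_id` and `session_id` are assumed names for links the tables do not spell out, and all sample records are illustrative.

```python
# Which key on each resource points at its parent, and what kind the parent is.
# document_id and session_id are assumed names; the other two are documented.
PARENT_KEY = {
    "chunk": ("document_id", "document"),
    "document": ("session_id", "session"),
    "session": ("extraction_template_id", "template"),
    "template": ("workspace_id", "workspace"),
}

# Illustrative in-memory records, keyed by resource kind and then by ID.
store = {
    "workspace": {"ws1": {"id": "ws1", "name": "Finance"}},
    "template": {"t1": {"id": "t1", "name": "Invoices", "workspace_id": "ws1"}},
    "session": {"s1": {"id": "s1", "name": "March batch", "extraction_template_id": "t1"}},
    "document": {"d1": {"id": "d1", "filename": "inv-001.pdf", "session_id": "s1"}},
    "chunk": {"c1": {"id": "c1", "content": "Invoice INV-001", "document_id": "d1"}},
}


def ancestry(kind: str, resource_id: str) -> list[tuple[str, str]]:
    """Return (kind, id) pairs from the given resource up to its workspace."""
    path = [(kind, resource_id)]
    while kind in PARENT_KEY:
        key, parent_kind = PARENT_KEY[kind]
        resource_id = store[kind][resource_id][key]
        kind = parent_kind
        path.append((kind, resource_id))
    return path


print(ancestry("chunk", "c1"))
```

In this sketch a session reaches its workspace through its template, which follows from each template belonging to exactly one workspace.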
Typical Workflow
1. Create a Workspace
   Set up a workspace for your project or team. Invite collaborators if needed.
2. Design an Extraction Template
   Define what data you want to extract using the JSON schema format. Include field definitions, search queries, and extraction prompts.
3. Create an Extraction Session
   For each batch of documents you want to process, create a session linked to your template.
4. Upload Documents
   Add your source documents (PDFs, images, etc.) to the session. Documents are automatically parsed into chunks.
5. Run Extraction
   Execute the extraction. The AI searches relevant chunks and extracts data according to your schema.
6. Retrieve Results
   Access the structured extraction results via API or export to Excel/CSV.
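If you retrieve results through the API rather than the built-in export, the last step can be finished locally. The sketch below turns a list of result records into CSV text. It assumes each result's `data` is a flat mapping of field name to value and that results arrive as plain dictionaries; neither shape is specified in this section, and the sample records are invented.

```python
import csv
import io


def results_to_csv(results: list[dict]) -> str:
    """Render completed results as CSV text, one row per result."""
    completed = [r for r in results if r["status"] == "completed"]

    # Collect every field name in first-seen order so sparse rows still fit.
    columns: list[str] = []
    for result in completed:
        for key in result["data"]:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["result_id"] + columns,
        restval="",           # leave missing fields blank
        lineterminator="\n",
    )
    writer.writeheader()
    for result in completed:
        writer.writerow({"result_id": result["id"], **result["data"]})
    return buffer.getvalue()


# Illustrative records; real result payloads may be shaped differently.
results = [
    {"id": "r1", "status": "completed", "data": {"invoice_number": "INV-001", "total": "120.00"}},
    {"id": "r2", "status": "failed", "data": {}},
    {"id": "r3", "status": "completed", "data": {"invoice_number": "INV-002"}},
]

csv_text = results_to_csv(results)
print(csv_text)
```

Results that are not `completed` are skipped here so that failed or in-progress rows do not appear as blank lines in the output.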
Resource Limits
| Resource | Limit |
|---|---|
| Workspaces per user | Based on plan |
| Templates per workspace | Unlimited |
| Sessions per template | Unlimited |
| Documents per session | 100 |
| File size | 50 MB |
Limits may vary based on your subscription plan. Contact support for enterprise limits.
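Checking a batch against these limits before uploading avoids partial uploads. The sketch below uses the two numeric limits from the table (100 documents per session, 50 MB per file). It assumes 1 MB means 1,048,576 bytes and that the limits apply as stated; your plan may differ, so treat the constants as placeholders.

```python
MAX_DOCUMENTS_PER_SESSION = 100
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumes 1 MB = 1,048,576 bytes


def check_upload(existing_document_count: int, file_sizes: dict[str, int]) -> list[str]:
    """Return a list of problems with the batch; an empty list means it fits."""
    problems = []

    total = existing_document_count + len(file_sizes)
    if total > MAX_DOCUMENTS_PER_SESSION:
        problems.append(
            f"session would hold {total} documents (limit {MAX_DOCUMENTS_PER_SESSION})"
        )

    for name, size in file_sizes.items():
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"{name} is {size} bytes (limit {MAX_FILE_SIZE_BYTES})")

    return problems


# A session that already holds 99 documents, plus two new files, one oversized.
batch = {"scan-a.pdf": 2_000_000, "scan-b.pdf": 60 * 1024 * 1024}
for problem in check_upload(99, batch):
    print(problem)
```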
