Raydocs organizes document extraction into a clear hierarchy. Understanding this structure helps you design effective workflows and use the API efficiently.

Resource Hierarchy

Core Resources

Workspace

The top-level organizational container. Workspaces group templates, sessions, and team members together. Each workspace has its own set of users with specific roles (admin, user, readonly).

Field    Description
id       Unique identifier
name     Display name
icon     Emoji or icon
One workspace can contain multiple templates, unlimited sessions, and multiple team members.
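
To make the role model concrete, here is a minimal local sketch of a workspace record. The three role names come from the description above; the `members` mapping and the `can_write` rule (only readonly members are blocked from changes) are assumptions for illustration, not documented Raydocs behavior.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(str, Enum):
    ADMIN = "admin"
    USER = "user"
    READONLY = "readonly"


@dataclass
class Workspace:
    id: str
    name: str
    icon: str = ""
    # Hypothetical: maps a user identifier to that user's role in this workspace.
    members: dict[str, Role] = field(default_factory=dict)

    def can_write(self, user_id: str) -> bool:
        # Assumption: any member except a readonly one may make changes.
        role = self.members.get(user_id)
        return role is not None and role is not Role.READONLY
```

Modeling roles as an enum keeps invalid role strings out of the record: `Role("owner")` raises a `ValueError` instead of being stored silently.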

Template

Defines what data to extract. Templates contain the JSON schema that specifies which fields to extract from documents. Each template belongs to one workspace and can be used across many sessions.

Field           Description
id              UUID identifier
name            Template name
description     Optional description
schema_json     Extraction schema definition
settings        Parsing configuration
workspace_id    Parent workspace
Key relationships:
  • Belongs to one Workspace
  • Used by many Sessions
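
As a rough sketch, a template record with the fields listed above could be assembled and serialized as follows. The layout inside `invoice_schema` (a `fields` list with `name` and `prompt` keys) is hypothetical and only stands in for a real schema; the identifiers are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Template:
    id: str
    name: str
    workspace_id: str      # parent workspace
    schema_json: dict      # extraction schema definition
    description: str = ""  # optional
    settings: dict = field(default_factory=dict)  # parsing configuration


# Hypothetical schema layout, for illustration only; the real format
# is described in the Extraction Schema Guide.
invoice_schema = {
    "fields": [
        {"name": "invoice_number", "prompt": "Return the invoice number."},
        {"name": "total_amount", "prompt": "Return the total amount due."},
    ]
}

template = Template(
    id="00000000-0000-0000-0000-000000000001",  # placeholder UUID
    name="Invoices",
    workspace_id="ws_1",                        # placeholder workspace id
    schema_json=invoice_schema,
)

# asdict() converts the record into plain dicts and lists that json can serialize.
payload = json.dumps(asdict(template), indent=2)
```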

Session

An individual extraction job. Sessions are where actual extraction happens. You upload documents to a session, run the extraction, and retrieve results. Each session uses one template.

Field                     Description
id                        UUID identifier
name                      Session name
extraction_template_id    Template to use
status                    pending, processing, completed, failed
Key relationships:
  • Uses one Template
  • Contains many Documents
  • Produces Results
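
Because a session moves through several statuses, client code usually needs to wait for it to finish. The sketch below polls until the status is `completed` or `failed`. It assumes those two are the terminal statuses, and `get_status` is a hypothetical callable you would supply (for example, a function that fetches the session and returns its `status` field); no particular endpoint is implied.

```python
import time
from typing import Callable

# Assumption: these two statuses are final; "pending" and "processing" are not.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_session(
    get_status: Callable[[], str],
    interval: float = 2.0,
    max_checks: int = 30,
) -> str:
    """Poll get_status until it reports a terminal status or checks run out."""
    for _ in range(max_checks):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError("session did not reach a terminal status")
```

Bounding the loop with `max_checks` keeps a stuck session from blocking the caller forever.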

Document

A source file for extraction. Documents are PDFs, images, or other files uploaded to a session. After upload, documents are automatically parsed into chunks for extraction.

Field        Description
id           UUID identifier
filename     Original filename
mime_type    File type
size         Size in bytes
status       pending, parsing, parsed, failed
Key relationships:
  • Belongs to one Session
  • Contains many Chunks (after parsing)
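
Before uploading, it can help to compute the same descriptive fields locally. This sketch derives `filename`, `mime_type`, and `size` for a file held in memory. The helper name `describe_upload` and the `application/octet-stream` fallback are assumptions; how Raydocs itself detects file types is not stated here.

```python
import mimetypes


def describe_upload(filename: str, content: bytes) -> dict:
    """Build the descriptive fields of a document record from local data."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return {
        "filename": filename,
        # Assumption: fall back to a generic binary type when the extension is unknown.
        "mime_type": guessed or "application/octet-stream",
        "size": len(content),  # size in bytes
    }
```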

Chunk

A parsed segment of a document. When documents are processed, they’re split into chunks: meaningful segments of text with page references. The AI uses these chunks to find and extract data.

Field          Description
id             UUID identifier
content        Text content
page_number    Source page
chunk_index    Order within document
metadata       Additional parsing info
Key relationships:
  • Belongs to one Document
  • Referenced in extraction Results
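
The `chunk_index` and `page_number` fields are enough to reassemble text in reading order. The following sketch, which assumes chunks have already been loaded into local `Chunk` objects, groups chunk text by page; the `text_by_page` helper is illustrative and not part of any Raydocs client.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Chunk:
    id: str
    content: str
    page_number: int
    chunk_index: int
    metadata: dict = field(default_factory=dict)


def text_by_page(chunks: list[Chunk]) -> dict[int, str]:
    """Join chunk text per page, keeping chunks in document order."""
    pages: dict[int, list[str]] = defaultdict(list)
    for chunk in sorted(chunks, key=lambda c: c.chunk_index):
        pages[chunk.page_number].append(chunk.content)
    return {page: "\n".join(parts) for page, parts in pages.items()}
```

Sorting by `chunk_index` first means the result does not depend on the order in which chunks were fetched.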

Result

The extracted data output. Results contain the structured data extracted from session documents according to the template schema. Each result includes the extracted values and optionally AI reasoning traces.

Field        Description
id           UUID identifier
status       pending, processing, completed, failed
data         Extracted values
reasoning    AI reasoning (if enabled)
Key relationships:
  • Belongs to one Session
  • References source Documents/Chunks

Typical Workflow

1. Create a Workspace

Set up a workspace for your project or team. Invite collaborators if needed.

2. Design an Extraction Template

Define what data you want to extract using the JSON schema format. Include field definitions, search queries, and extraction prompts.
See the Extraction Schema Guide for detailed schema documentation.

3. Create an Extraction Session

For each batch of documents you want to process, create a session linked to your template.

4. Upload Documents

Add your source documents (PDFs, images, etc.) to the session. Documents are automatically parsed into chunks.

5. Run Extraction

Execute the extraction. The AI searches relevant chunks and extracts data according to your schema.

6. Retrieve Results

Access the structured extraction results via API or export to Excel/CSV.
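
The order of the steps above can be sketched with an in-memory stand-in. `InMemoryRaydocs` and all of its methods are hypothetical: they store records in dictionaries, perform no parsing or extraction, and do not reflect the real API's endpoints or signatures. The sketch only shows which identifier each step needs from the previous one.

```python
import uuid
from dataclasses import dataclass, field


def new_id() -> str:
    return str(uuid.uuid4())


@dataclass
class Session:
    name: str
    extraction_template_id: str
    id: str = field(default_factory=new_id)
    status: str = "pending"
    documents: list[str] = field(default_factory=list)  # filenames only


class InMemoryRaydocs:
    """Hypothetical stand-in: stores records in dicts and extracts nothing."""

    def __init__(self) -> None:
        self.workspaces: dict[str, str] = {}    # workspace id -> name
        self.templates: dict[str, dict] = {}    # template id -> record
        self.sessions: dict[str, Session] = {}  # session id -> session

    def create_workspace(self, name: str) -> str:
        workspace_id = new_id()
        self.workspaces[workspace_id] = name
        return workspace_id

    def create_template(self, workspace_id: str, name: str, schema_json: dict) -> str:
        if workspace_id not in self.workspaces:
            raise KeyError("unknown workspace")
        template_id = new_id()
        self.templates[template_id] = {
            "name": name,
            "workspace_id": workspace_id,
            "schema_json": schema_json,
        }
        return template_id

    def create_session(self, template_id: str, name: str) -> Session:
        if template_id not in self.templates:
            raise KeyError("unknown template")
        session = Session(name=name, extraction_template_id=template_id)
        self.sessions[session.id] = session
        return session

    def upload_document(self, session_id: str, filename: str) -> None:
        self.sessions[session_id].documents.append(filename)

    def run_extraction(self, session_id: str) -> str:
        session = self.sessions[session_id]
        # Assumption: a session needs at least one document before it can run.
        if not session.documents:
            raise ValueError("upload at least one document first")
        session.status = "completed"  # a real run would pass through "processing"
        return session.status


client = InMemoryRaydocs()
workspace_id = client.create_workspace("Finance")                               # step 1
template_id = client.create_template(workspace_id, "Invoices", {"fields": []})  # step 2
session = client.create_session(template_id, "March batch")                     # step 3
client.upload_document(session.id, "invoice-001.pdf")                           # step 4
final_status = client.run_extraction(session.id)                                # step 5
```

Each call takes the identifier produced by the step before it, which mirrors the hierarchy: a template needs a workspace, a session needs a template, and a document needs a session.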

Resource Limits

Resource                   Limit
Workspaces per user        Based on plan
Templates per workspace    Unlimited
Sessions per template      Unlimited
Documents per session      100
File size                  50 MB
Limits may vary based on your subscription plan. Contact support for enterprise limits.
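
A client-side check against the default limits can catch oversized batches before any upload starts. This is a minimal sketch: it assumes 1 MB means 1024 × 1024 bytes (the table does not say which convention applies) and hard-codes the default values, which may differ on your plan. The `check_upload` helper is illustrative.

```python
MAX_DOCUMENTS_PER_SESSION = 100
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # Assumption: 1 MB = 1024 * 1024 bytes.


def check_upload(existing_count: int, file_sizes: list[int]) -> list[str]:
    """Return a list of problems; an empty list means the batch fits the limits."""
    problems = []
    total = existing_count + len(file_sizes)
    if total > MAX_DOCUMENTS_PER_SESSION:
        problems.append(
            f"session would hold {total} documents, above the limit of "
            f"{MAX_DOCUMENTS_PER_SESSION}"
        )
    for index, size in enumerate(file_sizes):
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(
                f"file {index} is {size} bytes, above the limit of "
                f"{MAX_FILE_SIZE_BYTES} bytes"
            )
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all offending files in a single pass.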

API Navigation