This guide explains how to upload documents to Raydocs for extraction using the API. The process has three steps: get a signed URL, upload the file to temporary storage, then create or reuse a workspace-scoped document and attach it to an extraction session.
Upload Flow Overview
Step 1: Get a Signed Upload URL
First, request a signed URL from the Vapor storage endpoint. This URL allows you to upload directly to S3 without routing the file through the API server.
```http
POST /vapor/signed-storage-url HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json

{
  "visibility": "private"
}
```
Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| visibility | string | No | Set to `private` (the default) |

The `content_type` parameter is optional and defaults to `application/octet-stream`; you don't need to detect or specify file types.
Response
```json
{
  "uuid": "abc123-def456-ghi789",
  "key": "tmp/abc123-def456-ghi789",
  "url": "https://s3.amazonaws.com/bucket/tmp/abc123...?X-Amz-Signature=...",
  "headers": {
    "Content-Type": "application/octet-stream"
  }
}
```
The signed URL is valid for a limited time (typically 5 minutes). Upload
your file promptly after receiving it.
Step 2: Upload to S3
Use the signed URL to upload your file directly to S3. Include the headers returned in the previous step.
```bash
curl -X PUT "${SIGNED_URL}" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @document.pdf
```
```javascript
const response = await fetch(signedUrl, {
  method: 'PUT',
  headers: headers, // Use headers from Step 1 response
  body: fileBuffer
});

if (response.ok) {
  console.log('Upload successful');
}
```
```python
import requests

with open('document.pdf', 'rb') as f:
    response = requests.put(
        signed_url,
        headers=upload_data.get('headers', {}),  # Use headers from Step 1
        data=f
    )

if response.status_code == 200:
    print('Upload successful')
```
Step 3: Create or Reuse Document and Attach to Session
After the file is uploaded to S3, create (or reuse) workspace-scoped document(s) and attach them to your extraction session. Use the uploaded key(s) from Step 1.
```http
POST /extractions/sessions/{sessionId}/documents/upload HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json

{
  "keys": ["tmp/abc123-def456-ghi789"]
}
```
Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| keys | array[string] | Yes | List of `key` values returned by signed URL requests in Step 1 |
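Because `keys` is an array, several uploaded files can be attached in one call. The sketch below just assembles and sanity-checks that request body offline; the `tmp/` prefix check mirrors the example keys shown in this guide and is an assumption about the key format, not a documented contract.

```python
def attach_request_body(keys):
    """Build the JSON body for the attach call, validating the key shape."""
    if not keys:
        raise ValueError("keys must be a non-empty list")
    for k in keys:
        # Assumed shape: signed-URL keys in this guide all look like "tmp/<uuid>".
        if not k.startswith("tmp/"):
            raise ValueError(f"unexpected key (should come from Step 1): {k}")
    return {"keys": list(keys)}

body = attach_request_body(["tmp/abc123-def456-ghi789", "tmp/zzz999-yyy888-xxx777"])
```

Send `body` as the JSON payload of the `POST .../documents/upload` request above.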
Response
```json
{
  "data": [
    {
      "id": "880e8400-e29b-41d4-a716-446655440000",
      "workspace_id": "660e8400-e29b-41d4-a716-446655440000",
      "filename": "invoice_001.pdf",
      "sha256": "6de7f6f5894c9f3fd1f6f8a4d1b3115d0d9b4b19d7a8a661f9fe90f9c2d80c3b",
      "status": "uploaded",
      "created_at": "2024-01-15T10:30:00Z"
    }
  ]
}
```
Upload/import is storage-only. Parsing is requested explicitly (reparse endpoint) or at extraction run time when required.
Deduplication is content-based at workspace scope. If two uploads have the same bytes, Raydocs reuses the same document record even when filenames differ.
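The dedup behavior can be reasoned about locally: the `sha256` field in the response above is a hash of the uploaded bytes, so identical content yields identical fingerprints regardless of filename. A standalone sketch (no API calls; the byte strings are illustrative):

```python
import hashlib

def content_fingerprint(data: bytes) -> str:
    """Return the hex SHA-256 digest of the file bytes."""
    return hashlib.sha256(data).hexdigest()

# Two "files" with the same bytes but different names share a fingerprint,
# so content-based dedup would reuse one document record for both uploads.
a = content_fingerprint(b"%PDF-1.7 example bytes")
b = content_fingerprint(b"%PDF-1.7 example bytes")
c = content_fingerprint(b"%PDF-1.7 other bytes")
print(a == b, a == c)  # True False
```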
Complete Example
Here’s a complete example in JavaScript:
```javascript
async function uploadDocument(sessionId, file, apiToken) {
  // Step 1: Get signed URL
  const signedUrlResponse = await fetch(
    "https://api.raydocs.com/vapor/signed-storage-url",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ visibility: "private" }),
    }
  );
  if (!signedUrlResponse.ok) {
    throw new Error(`Signed URL request failed: ${signedUrlResponse.status}`);
  }
  const { url, key, headers } = await signedUrlResponse.json();

  // Step 2: Upload to S3
  const uploadResponse = await fetch(url, {
    method: "PUT",
    headers: headers,
    body: file,
  });
  if (!uploadResponse.ok) {
    throw new Error(`S3 upload failed: ${uploadResponse.status}`);
  }

  // Step 3: Associate with session
  const documentResponse = await fetch(
    `https://api.raydocs.com/extractions/sessions/${sessionId}/documents/upload`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ keys: [key] }),
    }
  );
  return documentResponse.json();
}
```
PDF, images (PNG, JPEG, TIFF), and Office documents (DOCX, PPTX) are supported.
Alternative: Workspace-first Flow
To create documents in your workspace first (and optionally attach to sessions later), use the workspace document endpoint:
```http
POST /workspaces/{workspaceId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json

{
  "key": "tmp/abc123-def456-ghi789",
  "filename": "invoice.pdf"
}
```
See Create Workspace Document for full details. You can also import from URL without using signed URLs.
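As an illustration of the workspace-first call shape, here is a minimal Python sketch. The `post` transport is a hypothetical injected helper (any thin wrapper over an HTTP client that adds the Authorization header); the usage below stubs it out so the example runs offline.

```python
def create_workspace_document(post, workspace_id, key, filename):
    """Build and send the workspace-first request.
    `post(path, body)` is a hypothetical transport callable."""
    path = f"/workspaces/{workspace_id}/documents"
    return post(path, {"key": key, "filename": filename})

# Offline usage with a stub transport that just echoes the request:
def echo_post(path, body):
    return {"path": path, "body": body}

resp = create_workspace_document(
    echo_post, "ws-1", "tmp/abc123-def456-ghi789", "invoice.pdf"
)
print(resp["path"])  # /workspaces/ws-1/documents
```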
Monitoring Processing Status
After uploading, poll the document endpoint to check processing status:
```http
GET /workspaces/{workspaceId}/documents/{documentId} HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
```
Or list all documents in the workspace:
```http
GET /workspaces/{workspaceId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
```
For session-scoped listing:
```http
GET /extractions/sessions/{sessionId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
```
Wait for all documents to reach `processed` status before running an extraction.
If you trigger a run early, the API may return `409` with `status: parsing_pending`; the manual run will auto-resume once the required parsing artifacts are ready.
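The wait-then-run step above can be sketched as a small polling helper. `fetch_status` is a hypothetical callable (wire it to the GET document endpoint and read the `status` field); the usage feeds it a canned status sequence so the sketch runs without a network.

```python
import time

def wait_for_documents(fetch_status, doc_ids, poll_interval=2.0, timeout=60.0):
    """Poll until every document reaches a terminal status.
    `fetch_status(doc_id)` is a hypothetical callable returning the
    document's current status string. Returns a {doc_id: status} map;
    raises TimeoutError if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while True:
        statuses = {d: fetch_status(d) for d in doc_ids}
        if all(s in ("processed", "failed") for s in statuses.values()):
            return statuses
        if time.monotonic() > deadline:
            raise TimeoutError(f"documents still pending: {statuses}")
        time.sleep(poll_interval)

# Offline usage: a canned sequence that goes uploaded -> processed.
seq = iter(["uploaded", "processed"])
result = wait_for_documents(lambda d: next(seq), ["doc-1"], poll_interval=0)
```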
Error Handling
Common Upload Errors
| Error | Cause | Solution |
|---|---|---|
| 403 Forbidden on S3 | Signed URL expired | Request a new signed URL |
| 413 Payload Too Large | File exceeds size limit | Compress or split the document |
| 422 Unprocessable Entity | Invalid key | Ensure you're using the `key` from Step 1 |
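The first row of the table (expired signed URL) lends itself to a simple retry: request a fresh URL and try the upload once more. A sketch with hypothetical `get_signed_url` / `put_file` callables standing in for Steps 1 and 2, exercised with stubs so it runs offline:

```python
def upload_with_refresh(get_signed_url, put_file, max_attempts=2):
    """Retry an S3 upload with a fresh signed URL whenever it returns 403.
    `get_signed_url()` returns the Step 1 response dict; `put_file(url)`
    performs the PUT and returns the HTTP status code (both hypothetical)."""
    last_status = None
    for _ in range(max_attempts):
        grant = get_signed_url()
        last_status = put_file(grant["url"])
        if last_status != 403:
            return grant, last_status
    raise RuntimeError(f"upload kept failing (last status {last_status})")

# Offline usage: the first URL is "expired" (403), the refreshed one succeeds.
urls = iter([{"url": "expired"}, {"url": "fresh"}])
grant, status = upload_with_refresh(
    lambda: next(urls),
    lambda u: 403 if u == "expired" else 200,
)
```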
Processing Failures
If a document's status becomes `failed`:
- Check the document’s error message via the GET endpoint
- Verify the file is a valid, non-corrupted document
- Re-upload if the file was damaged during transfer