This guide explains how to upload documents to Raydocs for extraction using the API. The process involves three steps: requesting a signed upload URL, uploading the file to temporary storage, and associating it with an extraction session.
Upload Flow Overview
Step 1: Get a Signed Upload URL
First, request a signed URL from the Vapor storage endpoint. This URL allows you to upload directly to S3 without routing the file through the API server.
POST /vapor/signed-storage-url HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json
{
"visibility": "private"
}
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| visibility | string | No | Set to private (default) |
The content_type parameter is optional and defaults to
application/octet-stream. You don’t need to detect or specify file types.
Response
{
"uuid": "abc123-def456-ghi789",
"key": "tmp/abc123-def456-ghi789",
"url": "https://s3.amazonaws.com/bucket/tmp/abc123...?X-Amz-Signature=...",
"headers": {
"Content-Type": "application/octet-stream"
}
}
The signed URL is valid for a limited time (typically 5 minutes). Upload
your file promptly after receiving it.
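Because the signed URL expires, a client that queues uploads may want to check the URL's age before using it. The following is a minimal sketch, assuming the typical 5-minute lifetime mentioned above. The response shown in this guide does not include an expiry field, so `SignedUpload`, `ASSUMED_LIFETIME_SECONDS`, `from_response`, and the 30-second safety margin are illustrative names and values, not part of the Raydocs API.

```python
import time
from dataclasses import dataclass, field

# Assumption: the guide says signed URLs are "typically" valid for 5 minutes.
ASSUMED_LIFETIME_SECONDS = 5 * 60


@dataclass
class SignedUpload:
    """Holds the Step 1 response plus the local time it was received."""
    url: str
    key: str
    headers: dict
    received_at: float = field(default_factory=time.monotonic)

    def is_probably_expired(self, now=None, margin_seconds=30.0):
        """Return True once the URL is within `margin_seconds` of the assumed expiry."""
        current = time.monotonic() if now is None else now
        age = current - self.received_at
        return age >= ASSUMED_LIFETIME_SECONDS - margin_seconds


def from_response(payload, received_at=None):
    """Build a SignedUpload from the JSON body of the signed URL response."""
    kwargs = {} if received_at is None else {"received_at": received_at}
    return SignedUpload(
        url=payload["url"],
        key=payload["key"],
        headers=payload.get("headers", {}),
        **kwargs,
    )
```

If `is_probably_expired()` returns True, requesting a new signed URL is cheaper than attempting an upload that S3 is likely to reject.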
Step 2: Upload to S3
Use the signed URL to upload your file directly to S3. Include the headers returned in the previous step.
curl -X PUT "${SIGNED_URL}" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @document.pdf
const response = await fetch(signedUrl, {
  method: 'PUT',
  headers: headers, // Use headers from Step 1 response
  body: fileBuffer
});
if (response.ok) {
  console.log('Upload successful');
}
import requests

with open('document.pdf', 'rb') as f:
    response = requests.put(
        signed_url,
        headers=upload_data.get('headers', {}),  # Use headers from Step 1
        data=f
    )
if response.status_code == 200:
    print('Upload successful')
Step 3: Associate with Session
After the file is uploaded to S3, create a document record and associate it with your extraction session. Use the key from Step 1 to reference the uploaded file.
POST /extractions/sessions/{sessionId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json
{
"key": "tmp/abc123-def456-ghi789",
"filename": "invoice_001.pdf"
}
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | The key returned from the signed URL request |
| filename | string | Yes | Original filename for display purposes |
Response
{
"id": "880e8400-e29b-41d4-a716-446655440000",
"filename": "invoice_001.pdf",
"status": "uploaded",
"extraction_session_id": "770e8400-e29b-41d4-a716-446655440000",
"meta": {},
"created_at": "2024-01-15T10:30:00Z"
}
Document processing starts automatically after association. The status will
progress from uploaded → processing → ready.
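The association request from this step can also be assembled in Python. The sketch below uses only the standard library and builds the request without sending it, so it can be inspected or logged first. The function name `build_associate_request` and the `API_BASE` constant are illustrative; the endpoint path, headers, and body fields are the ones shown above.

```python
import json
import os
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.raydocs.com"


def build_associate_request(session_id, key, file_path, api_token):
    """Build (but do not send) the Step 3 request for one uploaded file."""
    body = json.dumps({
        "key": key,  # The key returned by the signed URL request in Step 1
        "filename": os.path.basename(file_path),  # Display name only, no directories
    }).encode("utf-8")
    url = f"{API_BASE}/extractions/sessions/{quote(session_id, safe='')}/documents"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the response body is the JSON document record shown above.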
Complete Example
Here’s a complete example in JavaScript:
async function uploadDocument(sessionId, file, apiToken) {
  // Step 1: Get signed URL
  const signedUrlResponse = await fetch(
    "https://api.raydocs.com/vapor/signed-storage-url",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        visibility: "private",
      }),
    }
  );
  if (!signedUrlResponse.ok) {
    throw new Error(`Signed URL request failed: ${signedUrlResponse.status}`);
  }
  const { url, key, headers } = await signedUrlResponse.json();

  // Step 2: Upload to S3
  const uploadResponse = await fetch(url, {
    method: "PUT",
    headers: headers,
    body: file,
  });
  if (!uploadResponse.ok) {
    throw new Error(`S3 upload failed: ${uploadResponse.status}`);
  }

  // Step 3: Associate with session
  const documentResponse = await fetch(
    `https://api.raydocs.com/extractions/sessions/${sessionId}/documents`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        key: key,
        filename: file.name,
      }),
    }
  );
  if (!documentResponse.ok) {
    throw new Error(`Document association failed: ${documentResponse.status}`);
  }
  return documentResponse.json();
}
PDF, images (PNG, JPEG, TIFF), and Office documents (DOCX, PPTX) are supported.
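A client can filter out unsupported files before requesting signed URLs. The sketch below is a client-side pre-check only: the mapping from the listed formats to file extensions is an assumption (the guide names formats, not extensions), the helper names are illustrative, and the server remains the authority on what it accepts.

```python
from pathlib import Path

# Assumption: common extensions for the formats the guide lists
# (PDF, PNG, JPEG, TIFF, DOCX, PPTX).
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".docx", ".pptx",
}


def is_supported(filename):
    """Return True if the filename's extension matches a listed format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_supported(filenames):
    """Partition filenames into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for name in filenames:
        (supported if is_supported(name) else unsupported).append(name)
    return supported, unsupported
```

Checking by extension is cheap but approximate, since a file can carry the wrong extension.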
Monitoring Processing Status
After uploading, poll the document endpoint to check processing status:
GET /extractions/documents/{documentId} HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Or list all documents in the session:
GET /extractions/sessions/{sessionId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Wait for all documents to reach ready status before running an extraction.
Running an extraction while any document is still processing will fail.
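The wait described above can be sketched as a polling loop with a timeout. This guide does not show the response shape of the list endpoint, so the sketch takes a caller-supplied `fetch_statuses` function that returns a `{document_id: status}` dict. The function name, the default timeout, and the polling interval are all assumptions for illustration.

```python
import time


def wait_until_ready(fetch_statuses, timeout_seconds=300.0, interval_seconds=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until every document in the session is ready.

    `fetch_statuses` is a caller-supplied function returning {document_id: status}.
    Raises RuntimeError if any document fails, TimeoutError if the deadline passes.
    """
    deadline = clock() + timeout_seconds
    while True:
        statuses = fetch_statuses()
        failed = sorted(doc_id for doc_id, s in statuses.items() if s == "failed")
        if failed:
            raise RuntimeError(f"Documents failed processing: {failed}")
        # An empty result is treated as "not ready yet", not as success.
        if statuses and all(s == "ready" for s in statuses.values()):
            return statuses
        if clock() >= deadline:
            raise TimeoutError("Documents did not reach ready status in time")
        sleep(interval_seconds)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real delays; production callers can leave them at their defaults.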
Error Handling
Common Upload Errors
| Error | Cause | Solution |
|---|---|---|
| 403 Forbidden on S3 | Signed URL expired | Request a new signed URL |
| 413 Payload Too Large | File exceeds size limit | Compress or split the document |
| 422 Unprocessable Entity | Invalid key | Ensure you’re using the key from Step 1 |
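The 403 case in the table lends itself to an automatic retry: request a new signed URL and upload again. The sketch below keeps the HTTP calls out of the helper so it works with any client library. `get_signed_url` and `put_file` are caller-supplied functions (hypothetical names) wrapping Steps 1 and 2, and the limit of three attempts is an arbitrary choice.

```python
def upload_with_refresh(get_signed_url, put_file, max_attempts=3):
    """Upload a file, requesting a new signed URL whenever S3 answers 403.

    `get_signed_url()` returns the Step 1 JSON as a dict.
    `put_file(url, headers)` performs the Step 2 PUT and returns the HTTP status code.
    Returns the storage key of the successful upload.
    """
    for attempt in range(1, max_attempts + 1):
        signed = get_signed_url()
        status = put_file(signed["url"], signed.get("headers", {}))
        if 200 <= status < 300:
            return signed["key"]
        if status == 403 and attempt < max_attempts:
            continue  # Signed URL likely expired: fetch a new one and retry.
        # Other errors (for example 413) will not be fixed by a new URL.
        raise RuntimeError(f"Upload failed with HTTP {status} on attempt {attempt}")
    raise ValueError("max_attempts must be at least 1")
```

Only 403 is retried, since a 413 means the file itself has to be compressed or split before another attempt can succeed.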
Processing Failures
If a document’s status becomes failed:
- Check the document’s error message via the GET endpoint
- Verify the file is a valid, non-corrupted document
- Re-upload if the file was damaged during transfer