This guide explains how to upload documents to Raydocs for extraction using the API. The process involves three steps: getting a signed URL, uploading the file to temporary storage, then associating it with an extraction session.
Looking for the API reference? See Create Document for the endpoint specification.
Upload Flow Overview
Step 1: Get a Signed Upload URL
First, request a signed URL from the Vapor storage endpoint. This URL allows you to upload directly to S3 without routing the file through the API server.
POST /vapor/signed-storage-url HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json

{
  "visibility": "private"
}
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| visibility | string | No | Set to private (default) |
The content_type parameter is optional and defaults to
application/octet-stream. You don’t need to detect or specify file types.
Response
{
  "uuid": "abc123-def456-ghi789",
  "key": "tmp/abc123-def456-ghi789",
  "url": "https://s3.amazonaws.com/bucket/tmp/abc123...?X-Amz-Signature=...",
  "headers": {
    "Content-Type": "application/octet-stream"
  }
}
The signed URL is valid for a limited time (typically 5 minutes). Upload
your file promptly after receiving it.
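Step 1 can also be sketched in Python. This is a minimal illustration using only the standard library, not an official client: the function names and the injectable urlopen parameter are assumptions made for this example, while the path, host, headers, and body come from the request shown above.

```python
import json
import urllib.request

API_BASE = "https://api.raydocs.com"  # host taken from the request example above


def build_signed_url_request(api_token, visibility="private"):
    """Build (but do not send) the POST request for a signed upload URL."""
    body = json.dumps({"visibility": visibility}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/vapor/signed-storage-url",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def get_signed_url(api_token, urlopen=urllib.request.urlopen):
    """Send the request and return the parsed JSON (uuid, key, url, headers).

    urlopen is injectable (hypothetical convenience) so the call can be
    exercised without network access.
    """
    request = build_signed_url_request(api_token)
    with urlopen(request) as response:
        return json.load(response)
```

Because the signed URL expires after a short time, calling get_signed_url immediately before the upload, rather than caching its result, is the safer pattern.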
Step 2: Upload to S3
Use the signed URL to upload your file directly to S3. Include the headers returned in the previous step.
curl -X PUT "${SIGNED_URL}" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @document.pdf
const response = await fetch(signedUrl, {
  method: 'PUT',
  headers: headers, // Use headers from Step 1 response
  body: fileBuffer
});

if (response.ok) {
  console.log('Upload successful');
}
import requests

with open('document.pdf', 'rb') as f:
    response = requests.put(
        signed_url,
        headers=upload_data.get('headers', {}),  # Use headers from Step 1
        data=f
    )

if response.status_code == 200:
    print('Upload successful')
Step 3: Associate with Session
After the file is uploaded to S3, create a document record and associate it with your extraction session. Use the key from Step 1 to reference the uploaded file.
POST /extractions/sessions/{sessionId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Content-Type: application/json

{
  "key": "tmp/abc123-def456-ghi789",
  "filename": "invoice_001.pdf"
}
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | The key returned from the signed URL request |
| filename | string | Yes | Original filename for display purposes |
Response
{
  "id": "880e8400-e29b-41d4-a716-446655440000",
  "filename": "invoice_001.pdf",
  "status": "uploaded",
  "extraction_session_id": "770e8400-e29b-41d4-a716-446655440000",
  "meta": {},
  "created_at": "2024-01-15T10:30:00Z"
}
Document processing starts automatically after association. The status will
progress from uploaded → processing → ready.
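A Python sketch of the association call may help alongside the raw HTTP request. It assumes only the standard library; the function names and the injectable urlopen parameter are hypothetical conveniences for this example, while the path and body fields (key, filename) are the ones documented above.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.raydocs.com"  # host taken from the request example above


def build_associate_request(api_token, session_id, key, filename):
    """Build (but do not send) the POST that attaches an uploaded file to a session."""
    # Percent-encode the session ID so it is safe to embed in the URL path.
    session_path = urllib.parse.quote(session_id, safe="")
    body = json.dumps({"key": key, "filename": filename}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/extractions/sessions/{session_path}/documents",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def associate_document(api_token, session_id, key, filename,
                       urlopen=urllib.request.urlopen):
    """Send the request and return the created document record as a dict."""
    request = build_associate_request(api_token, session_id, key, filename)
    with urlopen(request) as response:
        return json.load(response)
```

The returned dict should carry the id and status fields shown in the response above; the id is what the status endpoints expect later.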
Complete Example
Here’s a complete example in JavaScript:
async function uploadDocument(sessionId, file, apiToken) {
  // Step 1: Get signed URL
  const signedUrlResponse = await fetch(
    "https://api.raydocs.com/vapor/signed-storage-url",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        visibility: "private",
      }),
    }
  );
  if (!signedUrlResponse.ok) {
    throw new Error(`Signed URL request failed: ${signedUrlResponse.status}`);
  }
  const { url, key, headers } = await signedUrlResponse.json();

  // Step 2: Upload to S3 (stop here if the upload fails, so we never
  // associate a key that has no file behind it)
  const uploadResponse = await fetch(url, {
    method: "PUT",
    headers: headers,
    body: file,
  });
  if (!uploadResponse.ok) {
    throw new Error(`Upload failed: ${uploadResponse.status}`);
  }

  // Step 3: Associate with session
  const documentResponse = await fetch(
    `https://api.raydocs.com/extractions/sessions/${sessionId}/documents`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        key: key,
        filename: file.name,
      }),
    }
  );
  if (!documentResponse.ok) {
    throw new Error(`Association failed: ${documentResponse.status}`);
  }
  return documentResponse.json();
}
PDF, images (PNG, JPEG, TIFF), and Office documents (DOCX, PPTX) are supported.
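An optional client-side pre-check can catch unsupported files before any upload is attempted. The sketch below is an assumption-laden convenience, not part of the API: the extension set is my own mapping of the formats listed above to common file extensions, and the server remains the authority on what it accepts.

```python
from pathlib import Path

# Assumed mapping of the supported formats (PDF, PNG, JPEG, TIFF, DOCX, PPTX)
# to common file extensions; adjust if your files use other suffixes.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".docx", ".pptx",
}


def is_supported(filename):
    """Return True if the filename's extension matches a supported format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, is_supported("invoice_001.pdf") is true, while a file with no extension or a .txt suffix is rejected.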
Monitoring Processing Status
After uploading, poll the document endpoint to check processing status:
GET /extractions/documents/{documentId} HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Or list all documents in the session:
GET /extractions/sessions/{sessionId}/documents HTTP/1.1
Host: api.raydocs.com
Authorization: Bearer <token>
Wait for all documents to reach ready status before running an extraction.
Running an extraction while any document is still processing will fail.
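The polling described above could look roughly like the following sketch. It assumes a caller-supplied fetch_document function (hypothetical) that performs the GET request and returns the document as a dict with a status field; the interval and timeout defaults are illustrative values, not documented limits. The all_ready helper likewise assumes the session listing yields a list of such dicts.

```python
import time

# Statuses named in this guide; "ready" and "failed" end the polling loop.
TERMINAL_STATUSES = {"ready", "failed"}


def wait_for_document(fetch_document, document_id, interval=2.0, timeout=300.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the document is ready or failed, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        document = fetch_document(document_id)
        if document["status"] in TERMINAL_STATUSES:
            return document
        if clock() >= deadline:
            raise TimeoutError(f"document {document_id} still {document['status']}")
        sleep(interval)


def all_ready(documents):
    """Return True only if every document in the list has status 'ready'."""
    return all(doc["status"] == "ready" for doc in documents)
```

The caller should still inspect the returned status, since wait_for_document returns on failed as well as ready.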
Error Handling
Common Upload Errors
| Error | Cause | Solution |
|---|---|---|
| 403 Forbidden on S3 | Signed URL expired | Request a new signed URL |
| 413 Payload Too Large | File exceeds size limit | Compress or split the document |
| 422 Unprocessable Entity | Invalid key | Ensure you’re using the key from Step 1 |
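The table's advice for an expired signed URL can be sketched as a small retry wrapper. This is one possible approach under stated assumptions: get_signed_url and put_file are hypothetical callables supplied by the caller (the first returns the Step 1 response as a dict, the second performs the PUT and returns the HTTP status code), and max_attempts is an illustrative default.

```python
def upload_with_fresh_url(get_signed_url, put_file, max_attempts=2):
    """Upload a file, requesting a new signed URL if S3 answers 403.

    Returns the storage key to use in Step 3 on success.
    """
    for attempt in range(1, max_attempts + 1):
        signed = get_signed_url()
        status = put_file(signed["url"], signed.get("headers", {}))
        if 200 <= status < 300:
            return signed["key"]
        if status == 403 and attempt < max_attempts:
            continue  # signed URL likely expired; request a new one and retry
        if status == 413:
            raise ValueError("file exceeds the size limit; compress or split it")
        raise RuntimeError(f"upload failed with HTTP {status}")
```

Only the 403 case is retried, because a fresh URL cannot fix an oversized file or an invalid key.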
Processing Failures
If a document’s status becomes failed:
- Check the document’s error message via the GET endpoint
- Verify the file is a valid, non-corrupted document
- Re-upload if the file was damaged during transfer