How can I upload larger files? How can I upload faster?
You can, and sometimes must, upload a file in parts. Instead of opening a single HTTP connection to transfer the whole binary to the REST API, you open multiple connections, each of which ships one part of the file.
Multipart uploads enable the following:
Increasing the upload speed by uploading the parts concurrently.
Pausing the upload.
Multipart upload is required when a single-request upload would keep the HTTP connection open longer than the REST API timeout allows, and it is recommended for any file larger than 5 MB. 5 MB is also the minimum size of an upload part (chunk). All chunks must be the same size, except the last one, which may be smaller. An upload can have at most 10,000 chunks. The chunk size is set by the first chunk uploaded, based on its Content-Range header.
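The limits above can be turned into a small planning helper. The following is a minimal sketch, not part of the API: `MIN_CHUNK_SIZE` assumes that "5 MB" means 5 × 1024 × 1024 bytes (the text does not say whether the unit is binary or decimal), and the function names `choose_chunk_size`, `plan_chunks`, and `content_range` are hypothetical. Ranges use inclusive byte positions, which is how HTTP defines Content-Range.

```python
MIN_CHUNK_SIZE = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB
MAX_CHUNKS = 10_000               # maximum number of chunks per upload


def choose_chunk_size(file_size: int) -> int:
    """Smallest chunk size that respects both the minimum size and the chunk limit."""
    per_chunk = -(-file_size // MAX_CHUNKS)  # integer ceiling division
    return max(MIN_CHUNK_SIZE, per_chunk)


def plan_chunks(file_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) byte ranges; only the last one may be shorter."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if chunk_size < MIN_CHUNK_SIZE:
        raise ValueError("chunk_size is below the minimum part size")
    count = -(-file_size // chunk_size)
    if count > MAX_CHUNKS:
        raise ValueError("too many chunks; use a larger chunk_size")
    return [
        (start, min(start + chunk_size, file_size) - 1)
        for start in range(0, file_size, chunk_size)
    ]


def content_range(start: int, end: int, file_size: int) -> str:
    """Format the Content-Range header value for one chunk."""
    return f"bytes {start}-{end}/{file_size}"
```

For a 12 MiB file split into 5 MiB chunks, `plan_chunks` yields two full chunks followed by a 2 MiB final chunk. For very large files, `choose_chunk_size` grows the chunk size so that the count stays within 10,000.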
Example 3. Multipart upload
// Assumes FILE_PATH, CHUNK_SIZE and tagsBase64Encoded are defined elsewhere
import fs from 'fs';
import { arrayBuffer } from 'stream/consumers';

// Get total file size
const stats = await fs.promises.stat(FILE_PATH);
const fileSize = stats.size;

// Define function for uploading a single chunk of the file.
// `start` and `end` are inclusive byte positions, as in fs.createReadStream and Content-Range.
const uploadChunk = async (start: number, end: number, headers: Record<string, string> = {}) => {
  const chunkStream = fs.createReadStream(FILE_PATH, { start, end });
  const chunk = await arrayBuffer(chunkStream);
  const response = await fetch(`https://api.akord.com/files?tags=${tagsBase64Encoded}`, {
    method: 'POST',
    headers: {
      'Api-Key': 'your_api_key',
      'Content-Type': 'application/pdf',
      'Content-Range': `bytes ${start}-${end}/${fileSize}`,
      ...headers
    },
    body: chunk
  });
  if (response.status !== 202) {
    throw new Error('Failed to upload chunk of the file. Status code: ' + response.status);
  }
  return response;
};

// Upload first chunk of the file
const response = await uploadChunk(0, CHUNK_SIZE - 1);

// Read location of the multipart upload
const contentLocation = response.headers.get('Content-Location');
if (!contentLocation) {
  throw new Error('Content-Location header is missing');
}

// Upload middle chunks of the file using 'Content-Location' & 'Content-Range' - can be done concurrently
let sourceOffset = CHUNK_SIZE;
const chunkUploadPromises = [];
while (sourceOffset + CHUNK_SIZE < fileSize) {
  const chunkUploadPromise = uploadChunk(sourceOffset, sourceOffset + CHUNK_SIZE - 1, { 'Content-Location': contentLocation });
  chunkUploadPromises.push(chunkUploadPromise);
  sourceOffset += CHUNK_SIZE;
}
await Promise.all(chunkUploadPromises);

// Upload last chunk of the file to complete the multipart upload
const res = await uploadChunk(sourceOffset, fileSize - 1, { 'Content-Location': contentLocation });

import os
import requests

# Assumes CHUNK_SIZE and tags_base64_encoded are defined elsewhere

# Get total file size
file_path = './tests/data/20mb.pdf'
file_size = os.path.getsize(file_path)

# Define function to upload a chunk; start and end are inclusive byte positions
def upload_chunk(start, end, headers=None):
    with open(file_path, 'rb') as file:
        file.seek(start)
        chunk = file.read(end - start + 1)
    headers = dict(headers or {})
    headers['Content-Range'] = f'bytes {start}-{end}/{file_size}'
    headers['Content-Type'] = 'application/pdf'
    headers['Authorization'] = 'Bearer <your_access_token>'
    response = requests.post(f'{os.environ["BASE_URL"]}/files?tags={tags_base64_encoded}', data=chunk, headers=headers)
    if response.status_code != 202:
        raise ValueError('Failed to upload chunk of the file. Status code: ' + str(response.status_code))
    print("Uploaded chunk:", f'bytes {start}-{end}/{file_size}')
    return response

# Upload first chunk of the file
response = upload_chunk(0, CHUNK_SIZE - 1)

# Read location of the multipart upload
content_location = response.headers.get('Content-Location')
if not content_location:
    raise ValueError('Content-Location header is missing')

# Upload middle chunks of the file using 'Content-Location' & 'Content-Range' - can be done concurrently
source_offset = CHUNK_SIZE
while source_offset + CHUNK_SIZE < file_size:
    upload_chunk(source_offset, source_offset + CHUNK_SIZE - 1, {'Content-Location': content_location})
    source_offset += CHUNK_SIZE

# Upload last chunk of the file to complete the multipart upload
upload_chunk(source_offset, file_size - 1, {'Content-Location': content_location})
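The Python example notes that the middle chunks can be uploaded concurrently, yet it sends them one after another. The sketch below shows one way to run them in parallel with a thread pool from the standard library. It is an illustration under assumptions, not the API's prescribed method: `upload_multipart` is a hypothetical name, `upload_chunk` can be any callable with the `(start, end, headers)` signature used in the example, `ranges` is a list of inclusive `(start, end)` byte ranges, and the default of 4 workers is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Sequence

Range = tuple[int, int]  # inclusive (start, end) byte positions


def upload_multipart(
    ranges: Sequence[Range],
    upload_chunk: Callable[[int, int, dict], Any],
    max_workers: int = 4,
) -> Any:
    """Send the first chunk alone, the middle chunks concurrently, the last chunk at the end."""
    if len(ranges) < 2:
        raise ValueError("this sketch expects at least two chunks")

    # The first chunk starts the upload and returns its location.
    first = upload_chunk(ranges[0][0], ranges[0][1], {})
    location = first.headers.get("Content-Location")
    if not location:
        raise ValueError("Content-Location header is missing")

    def send(byte_range: Range) -> Any:
        start, end = byte_range
        return upload_chunk(start, end, {"Content-Location": location})

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every middle chunk and re-raises the first failure
        list(pool.map(send, ranges[1:-1]))

    # The last chunk goes out only after all middle chunks have finished.
    return send(ranges[-1])
```

Threads suit this case because each upload spends most of its time waiting on the network. Keeping `max_workers` modest limits how many chunks are held in memory at once.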