by Jason Gilman
Recently, I needed to implement large file uploads to Amazon Web Services (AWS) from a web browser. We had several requirements for a solution:
- Fast – The uploads should be as fast as possible to limit the amount of time a user spent waiting.
- Secure – Only authenticated users should be able to upload files.
- Reliable – Even multi-gigabyte files should work without timeouts or excessive cost.
S3 is the ideal place to upload files, but there are issues with uploading to S3 from a web browser. You don't want to give users free rein to upload whatever they want: the buckets should have permissions in place to prevent unauthorized access. You also can't put any kind of credentials in your website code, because they could easily be extracted.
One possible solution is sending the file through a Lambda proxy, but that can be problematic. It's expensive, because you pay for the Lambda to run for the entire duration of the upload, and large files can exhaust the Lambda's memory or hit its timeout.
Signed URLs for S3 Uploads
Time-limited, signed S3 URLs allow secure, direct uploads to S3. A signed URL grants permission to upload only a specific file with a specific content type, and it can be set to expire at a specific time.
This sequence diagram shows how that would work.
Issues with only using Signed URLs
This solution requires the browser to send the file in a single stream. To upload large files from a user's browser, you should use the S3 Multipart Upload capability instead. Your client code sends parts of the file concurrently, decreasing the total upload time.
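The first step of a multipart upload is splitting the file into parts. In the browser you'd use `File.slice`; here's the same idea as a small Python sketch, using S3's actual constraint that every part except the last must be at least 5 MB:

```python
PART_SIZE = 5 * 1024 * 1024  # S3 minimum: each part except the last must be >= 5 MB

def split_into_parts(data: bytes, part_size: int = PART_SIZE) -> list[bytes]:
    """Slice the file's bytes into ordered parts for a multipart upload."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]
```

Each part is then uploaded independently, which is what makes the concurrency possible.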
Multipart Uploads with Signed URLs
Each part of the file uploaded requires a separate signed URL. I was able to figure out how to do this thanks to a coworker who found this repository. It has a good set of example code for both the API and the browser side, though there's a problem with the example code that I'll note at the end.
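The API side therefore returns one URL per part, each tied to the multipart upload's ID and a 1-indexed part number (that is how S3 identifies parts). A hypothetical sketch, with the hostname made up and the signature query parameters omitted for brevity:

```python
def sign_part_urls(bucket: str, key: str, upload_id: str, part_count: int) -> list[str]:
    """Return one upload URL per part number (1-indexed, as S3 expects).

    In a real implementation each URL would also carry a signature,
    e.g. via the AWS SDK's presigning for the UploadPart operation.
    """
    return [
        f"https://{bucket}.example.com/{key}?uploadId={upload_id}&partNumber={n}"
        for n in range(1, part_count + 1)
    ]
```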
Important Note: Limit number of parallel uploads
The example code and my diagram above upload all of the parts of the file in parallel. Large files can result in too many concurrent uploads for the browser to handle. You should limit the number of parallel uploads by only uploading a set of the chunks at a time to S3. 10 concurrent uploads worked without issue but still provided reasonable upload times. Do your own testing to see what works for your use case.