Directly streaming data to a S3Object
Hi All, I have Th
Mathias Stümpert
Re: Directly streaming data to a S3Object

Hi Mathias,

I'm afraid it may be difficult to do what you want. It should be possible to stream data directly into an S3Object using pipes, so that it is uploaded to S3 as you receive it, but you are likely to run into a few problems that make reliable uploads hard to achieve.

First, you must know the exact size of the data before you commence an upload to S3. The Content-Length header in the PUT request must specify the number of bytes S3 will receive, and if this value is not exactly correct the upload will fail.

Another issue is that S3 has a very low tolerance for paused connections. If you are sending data to S3 and the upload pauses for more than 9 to 10 seconds, S3 will terminate the connection from its end. Unless you can be sure that you will be able to receive and relay data at a fairly constant rate without any hiccups, you may suffer from frequent dropped connections.

A final potential problem is that S3 uploads will occasionally fail with 500 Internal Server errors, even when the service is working properly. When this happens, you will have to recommence the whole upload and start streaming the data from the beginning. Unless you have the data cached locally, this will involve terminating the original FTP connection and starting a new one from the beginning.

In short, it should be possible to relay data rather than caching it locally, but you are likely to come across a number of difficulties. These issues will make your application a lot more complicated, as it will have to handle a lot of error conditions. Depending on your situation, it may not be worth the effort.

Faced with similar issues, I have taken the easy way out and simply stored the data in temporary files. My only suggestion would be to cache the FTP data in small files, and send these up to S3 as they arrive. This would simplify the process while avoiding the need to cache all of the data locally at once. However, this approach will not work if you need to keep the files in one piece.
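To make the small-files suggestion a bit more concrete, here is a rough sketch in plain Java. It is only an illustration under some assumptions: the names ChunkedRelay, ChunkUploader and relay are made up for this example and are not part of JetS3t, and the uploader callback stands in for whatever code actually PUTs a file to S3. The point it shows is that each chunk is written to a temporary file first, so its exact length is known before the upload starts.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class ChunkedRelay {

    /** Hypothetical callback: a real implementation would PUT the file to S3. */
    interface ChunkUploader {
        void upload(String key, Path file, long contentLength) throws IOException;
    }

    /**
     * Spools the source into temporary files of at most chunkSize bytes and
     * hands each completed file to the uploader together with its exact size.
     * Returns the number of chunks uploaded.
     */
    static int relay(InputStream source, String keyPrefix, int chunkSize,
                     ChunkUploader uploader) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[8192];
        int chunkIndex = 0;
        boolean endOfStream = false;
        while (!endOfStream) {
            Path chunk = Files.createTempFile("relay-", ".part");
            long written = 0;
            try {
                try (OutputStream out = Files.newOutputStream(chunk)) {
                    // Fill the current chunk until it is full or the source ends.
                    while (written < chunkSize) {
                        int want = (int) Math.min(buffer.length, chunkSize - written);
                        int n = source.read(buffer, 0, want);
                        if (n < 0) {
                            endOfStream = true;
                            break;
                        }
                        out.write(buffer, 0, n);
                        written += n;
                    }
                }
                // The chunk is complete on disk, so its length is known exactly.
                if (written > 0) {
                    uploader.upload(keyPrefix + "." + chunkIndex, chunk, written);
                    chunkIndex++;
                }
            } finally {
                Files.deleteIfExists(chunk);
            }
        }
        return chunkIndex;
    }
}
```

Because every chunk sits in a file until its upload succeeds, a failed upload only needs that one chunk to be resent, not the whole FTP transfer. The key naming scheme (prefix plus chunk index) is just a placeholder.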
James

On Sat, Mar 1, 2008 at 12:26 AM, Stuempert, Mathias IWR <mathias.stuempert@...> wrote:
-- http://www.jamesmurty.com
Re: Directly streaming data to a S3Object

Hi Mathias,

I'm glad my gloomy email didn't dissuade you from trying this, and that your piping approach is working well. That's excellent news. And thanks for posting the example code.

If your network connections are fast and reliable you should not experience most of the problems I was worried about. The most important thing is that the data you receive via FTP to relay/pipe to S3 must arrive at a rate greater than or equal to the S3 upload speed, to avoid any pauses in the upload.

Even if everything else works 100% you will still get the occasional 500 Internal Server Error from S3. By default, the JetS3t REST implementation will retry uploads that fail with this error. However, because you are providing data in an InputStream, the toolkit will only be able to retry the upload if it fails early in the attempt, before too much data has been sent. If too much data has already been sent you will get UnrecoverableIOException errors with a message like "Input stream is not repeatable as 247021 bytes have been written, exceeding the available buffer size of 131072"

There are three ways you could deal with this problem:
- Turn off retries for 500 Internal Server errors (jets3t.properties: s3service.internal-error-retry-max=0), catch the S3ServiceException when this happens, and retry the upload yourself
- Allow JetS3t to retry uploads when it can, but catch the UnrecoverableIOException when it can't and retry the upload yourself
- Implement your own input stream class that can be reset to repeat the input data from the beginning. The RepeatableFileInputStream class gives a good example for repeating files; you could do the same thing for your FTP connections. If you do this, JetS3t will be able to handle the retries for you.

This info is probably much more in-depth than you need right now, but if the 500 errors start causing you grief you will have some options. It's something you will have to deal with if you need your application to be very robust.
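For the "retry the upload yourself" options, the general shape might look something like the sketch below. Treat it as an illustration only: RetryingUpload, StreamSource and Uploader are hypothetical names invented for this example, not JetS3t classes, and a plain IOException stands in for the exception you would really catch (S3ServiceException or UnrecoverableIOException, depending on which option you pick). The key idea is that every attempt reopens the source from the first byte, which for you would mean starting a fresh FTP download.

```java
import java.io.IOException;
import java.io.InputStream;

final class RetryingUpload {

    /** Hypothetical: reopens the source (e.g. a new FTP download) from byte 0. */
    interface StreamSource {
        InputStream open() throws IOException;
    }

    /** Hypothetical: stands in for the real S3 PUT, which may fail with a 500 error. */
    interface Uploader {
        void put(InputStream data) throws IOException;
    }

    /**
     * Tries the upload up to maxAttempts times, reopening the source before
     * each attempt. Returns the number of the attempt that succeeded, or
     * rethrows the last failure if every attempt failed.
     */
    static int uploadWithRetry(StreamSource source, Uploader uploader,
                               int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A fresh stream per attempt, so no data from a failed try is reused.
            try (InputStream in = source.open()) {
                uploader.put(in);
                return attempt;
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

A real version would probably also wait a little between attempts and only retry on errors that are known to be transient, but that depends on how your application wants to behave.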
Cheers,
James

On Mon, Mar 3, 2008 at 11:58 PM, Stuempert, Mathias IWR <mathias.stuempert@...> wrote:
-- http://www.jamesmurty.com