Saturday, December 5, 2015

Reliable SFTP file upload without duplicates

SFTP is not the fanciest technology, but it is still used in quite a few areas. Some systems use it to exchange messages. In this scenario, one system generates a file and uploads it to an SFTP server. Another system scans the server for new files and processes them. Processed files are removed from the incoming folder. Simple, until you start thinking about error handling.

A file upload can easily be interrupted. In that case the remote system may pick up a partially uploaded file. If you're lucky, the remote system is smart enough to detect corrupted files and handle them accordingly, but what if it isn't? The trick is to use a temporary file.

The temporary file has a name or location that the remote system does not pick up. If the upload crashes, we can resume it or even restart the whole upload. Once we're sure that the temporary file has been uploaded completely, it's time to rename or move it to its proper location. Now the remote system can pick it up.
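The temporary-file trick can be sketched roughly like this. It is a minimal illustration, not our production code: `InMemorySftp` is a hypothetical stand-in for a real SFTP client so that the example runs on its own, and the `.part` naming is an assumption about what the remote system ignores.

```python
class InMemorySftp:
    """Hypothetical in-memory stand-in for a real SFTP client."""

    def __init__(self):
        self.files = {}  # remote path -> file content

    def put(self, data: bytes, remote_path: str) -> None:
        self.files[remote_path] = data

    def rename(self, old_path: str, new_path: str) -> None:
        self.files[new_path] = self.files.pop(old_path)


def upload_via_temp(client, data: bytes, folder: str, name: str) -> str:
    """Upload under a temporary name, then rename to the final name."""
    # Assumption: the remote system skips hidden files ending in ".part".
    temp_path = f"{folder}/.{name}.part"
    final_path = f"{folder}/{name}"

    # If this step is interrupted, only the temporary file is incomplete.
    client.put(data, temp_path)

    # The file becomes visible to the remote system only after the rename.
    client.rename(temp_path, final_path)
    return final_path


sftp = InMemorySftp()
print(upload_via_temp(sftp, b"<message/>", "/incoming", "msg-1.xml"))
```

With a real client the `put` and `rename` calls would go over the network, so each of them can fail independently. That is what the rest of this post deals with.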

Now let's imagine that the remote system is even less reliable and does not recognize duplicate files. We need to make sure that we never rename the temporary file to its proper name twice. Unfortunately, the network can drop before we receive the response to the rename operation, so we cannot tell whether the rename actually happened. We can verify the state by checking whether the temporary file still exists, and retry the rename only if it does.
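A sketch of that check-then-retry logic is below. It assumes the temporary file was fully uploaded before the function is called and that nothing else removes temporary files. `FlakySftp` is a hypothetical fake client whose first rename succeeds on the server but loses its reply, which is exactly the case we worry about.

```python
class FlakySftp:
    """Hypothetical fake: the first rename succeeds remotely but the reply is lost."""

    def __init__(self):
        self.files = {}
        self.rename_calls = 0
        self.drop_next_reply = True

    def exists(self, path: str) -> bool:
        return path in self.files

    def rename(self, old_path: str, new_path: str) -> None:
        self.rename_calls += 1
        self.files[new_path] = self.files.pop(old_path)
        if self.drop_next_reply:
            self.drop_next_reply = False
            raise ConnectionError("connection lost before the reply arrived")


def rename_once(client, temp_path: str, final_path: str, max_attempts: int = 3) -> bool:
    """Rename temp_path to final_path without ever doing it twice."""
    for _ in range(max_attempts):
        try:
            # If the temporary file is gone, an earlier rename already took effect.
            if not client.exists(temp_path):
                return True
            client.rename(temp_path, final_path)
            return True
        except ConnectionError:
            # Outcome unknown: loop around and check the temporary file again.
            continue
    return False


sftp = FlakySftp()
sftp.files["/incoming/.msg-1.xml.part"] = b"<message/>"
print(rename_once(sftp, "/incoming/.msg-1.xml.part", "/incoming/msg-1.xml"))
print(sftp.rename_calls)
```

The check looks at the temporary file rather than the final one on purpose: the remote system may already have processed and removed the final file, so its absence would tell us nothing.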

It is important to persist the fact that the temp file was uploaded successfully. If you use transactional persistence, make sure that this record won't be rolled back.
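One way to persist that fact is a small journal that is written outside of any surrounding transaction. The sketch below uses a JSON file for this. The `UploadJournal` class, the state names and the `deliver` function are all hypothetical, and `client` is assumed to offer `put`, `exists` and `rename` methods like the ones a real SFTP library wraps.

```python
import json
import os


class UploadJournal:
    """Hypothetical file-backed journal: file name -> upload state."""

    def __init__(self, path: str):
        self.path = path

    def _load(self) -> dict:
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as fh:
            return json.load(fh)

    def get(self, name: str) -> str:
        return self._load().get(name, "NEW")

    def set(self, name: str, state: str) -> None:
        states = self._load()
        states[name] = state
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(states, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # Replace the journal in one step so a crash cannot leave it half-written.
        os.replace(tmp, self.path)


def deliver(client, journal: UploadJournal, data: bytes, folder: str, name: str) -> None:
    """Upload and publish a file; safe to call again after a crash."""
    temp_path = f"{folder}/.{name}.part"
    final_path = f"{folder}/{name}"

    if journal.get(name) == "NEW":
        client.put(data, temp_path)
        # Recorded immediately, independent of any later failure.
        journal.set(name, "TEMP_UPLOADED")

    if journal.get(name) == "TEMP_UPLOADED":
        # A missing temporary file means a previous rename already succeeded.
        if client.exists(temp_path):
            client.rename(temp_path, final_path)
        journal.set(name, "DONE")
```

If the process dies between the upload and the rename, calling `deliver` again skips the upload and goes straight to the rename check. With a database instead of a file, the equivalent would be to commit the state change in its own transaction.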

Here is a flow that worked well for us:


After several months in production, we have not had any issues with this solution.
