File Transfer
Part of Networking
Moving files between computers — the most fundamental and common network application.
Why This Matters
File transfer is the original justification for building networks. Before email, before web browsing, before any other networked application, people needed to move files from one machine to another. Today, file transfer underpins nearly everything: software updates, backups, data sharing, media distribution, and document collaboration all depend on reliable file transfer mechanisms.
Understanding file transfer means understanding both the technical protocols that make it work and the practical considerations that determine which approach to use in different circumstances. In a reconstruction context, where bandwidth may be limited and reliability cannot be assumed, knowing how to move files efficiently and verifiably is essential operational knowledge.
File transfer is also a good entry point for understanding networked applications in general, because it exposes all the fundamental challenges: how to establish a connection, how to negotiate capabilities, how to handle errors, how to resume interrupted transfers, and how to verify that the received file matches the sent file.
Simple File Copy Methods
The simplest file transfer is a direct copy over a shared network filesystem. When two computers both mount the same network share (via NFS, SMB/CIFS, or a similar protocol), copying a file is identical to copying within a single machine — the operating system handles the network transfer transparently.
Shared filesystem access works well when both machines are nearby, connected by a fast and reliable link, and the transfer will complete in a single session. For larger files, longer distances, or less reliable links, shared filesystem copies have a significant weakness: if the connection is interrupted, the copy fails and must be restarted from the beginning. Partially transferred files are usually left in an incomplete state that may not be obvious until the application tries to use them.
For small files and controlled environments, direct copy is entirely adequate. For large files or unreliable links, more sophisticated approaches are needed.
FTP: File Transfer Protocol
FTP (File Transfer Protocol) is the original dedicated file transfer protocol, standardized in the 1970s and still widely used. FTP uses two TCP connections: a control connection for commands and a data connection for the actual file content. This separation allows FTP to handle large files without tying up the control channel.
FTP commands are text strings: LIST to show directory contents, RETR to download a file, STOR to upload, CWD to change directory, and so forth (interactive clients usually expose RETR and STOR through the shorthand get and put). These commands make FTP transparent and easy to debug: you can connect to an FTP server's control port with a plain terminal and issue commands manually to troubleshoot.
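A passive-mode download as it appears on the control connection illustrates this. Bare words are client commands, three-digit lines are server replies; the host, credentials, and filename here are invented:

```
220 ftp.example.org FTP server ready
USER anonymous
331 Please specify the password
PASS guest@example.org
230 Login successful
PASV
227 Entering Passive Mode (192,0,2,10,195,149)
RETR report.txt
150 Opening BINARY mode data connection for report.txt
226 Transfer complete
QUIT
221 Goodbye
```

The 227 reply encodes the address and port for the data connection: the last two numbers give the port as 195*256+149 = 50069, which the client then connects to for the file content.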
FTP’s weakness is security: it transmits both credentials (username and password) and file data in cleartext. Anyone who can observe the network traffic can see what files are being transferred and capture the passwords. In any environment where network traffic might be observed by untrusted parties, FTP should be replaced with SFTP (SSH File Transfer Protocol) or FTPS (FTP over TLS).
Active mode FTP has a firewall traversal problem: in active mode, the server initiates the data connection back to the client, which is blocked by most firewalls. Passive mode FTP has the client initiate both connections, which works better through firewalls. Most FTP clients default to passive mode, but if file transfers fail with some servers, switching to active mode may help.
SFTP and Secure Copy
SFTP is a file transfer protocol that runs over SSH. Despite the similar name, SFTP is not FTP over SSH — it is an entirely different protocol that was designed from the start to work within the SSH framework. SFTP provides the same functionality as FTP (directory listing, file upload/download, rename, delete) but with full encryption and authentication through SSH.
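An interactive SFTP session looks much like command-line FTP from the user's point of view; this sketch uses an invented host and file names:

```
$ sftp user@files.example.org
Connected to files.example.org.
sftp> ls
backups     docs     report.txt
sftp> get report.txt
Fetching /home/user/report.txt to report.txt
sftp> put notes.txt
Uploading notes.txt to /home/user/notes.txt
sftp> quit
```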
SCP (Secure Copy Protocol) is a simpler alternative to SFTP, also running over SSH. SCP provides only file transfer (no interactive browsing or file management) and is typically used as a command-line tool for scripted transfers. The syntax is similar to the Unix cp command: scp local_file user@remote_host:/remote/path to copy to a remote system.
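Some typical scp invocations, sketched with placeholder user, host, and path names; the final local-to-local copy is a real command you can run, since scp without a host in either path behaves like a plain copy:

```shell
# Copy a file to a remote machine (user, host, and path are placeholders):
#   scp report.pdf user@remote_host:/srv/incoming/
# Copy a directory tree recursively, preserving timestamps and modes:
#   scp -rp ./designs user@remote_host:/srv/incoming/
# With no host on either side, scp performs a local copy,
# which is a convenient way to test syntax:
echo "hello" > /tmp/scp_demo_src.txt
scp -q /tmp/scp_demo_src.txt /tmp/scp_demo_dst.txt
cat /tmp/scp_demo_dst.txt
```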
For any environment where security matters, SFTP or SCP should be the default choice for file transfer. They require SSH to be running on the destination machine, which is standard on any Unix/Linux system and available for Windows.
rsync: Efficient Incremental Transfer
rsync is a file synchronization tool designed for efficient transfer of large files and directory trees. Its key feature is delta transfer: rsync computes which parts of a file have changed and transfers only those parts. For large files where only a small portion changes (database files, disk images, log files), rsync can reduce transfer size by 90% or more compared to copying the entire file.
rsync uses a rolling checksum algorithm to identify identical blocks between the source and destination files, even if the blocks have moved to different positions. This makes it effective for synchronizing files where content has shifted — such as files with new data prepended or inserted — not just appended.
For backups and synchronization, rsync is the standard tool. A typical backup command: rsync -avz --delete /source/directory/ user@backup_host:/backup/directory/ copies all files, preserves permissions and timestamps, compresses during transfer, and removes destination files that no longer exist in the source. The --delete flag is important for maintaining a true mirror — without it, deleted source files accumulate in the backup.
rsync can also resume interrupted transfers. With the --partial flag (or -P, which adds a progress display), partially transferred files are kept rather than discarded, so the next run continues from where the transfer left off, which is critical for moving large files over unreliable connections.
Transfer Verification
Moving a file does not guarantee the file arrived correctly. Network errors, storage faults, and software bugs can all produce a received file that differs from the sent file. For important files, verification is essential.
The standard verification method is to compute a cryptographic hash of the original file, transfer the hash along with the file, and compute the hash of the received file. If the hashes match, the file is (with overwhelming probability) identical to the original. SHA-256 is the current standard for this purpose; MD5 is faster, but its collision resistance is broken, so it can still catch accidental corruption yet cannot be trusted to detect deliberate tampering.
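With the standard sha256sum tool, the whole round trip is two commands (the file names here are arbitrary):

```shell
# Sender: record the hash next to the file, then transfer both together.
echo "important payload" > /tmp/data.bin
sha256sum /tmp/data.bin > /tmp/data.bin.sha256
# Receiver: -c recomputes the hash and compares it against the recorded value.
sha256sum -c /tmp/data.bin.sha256    # reports "/tmp/data.bin: OK" on a match
```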
Most file transfer tools provide some form of integrity checking. FTP has no built-in verification, so the hash must be transferred separately. rsync verifies a whole-file checksum after each transfer and can compare checksums instead of timestamps with its -c option, and SSH's transport-layer integrity protection covers SFTP and SCP data on the wire. Many file distribution systems (software repositories, torrent files) publish hashes alongside files as standard practice.
For critical data — medical records, engineering designs, financial data — always verify transfers with a hash. For less critical data, the underlying TCP checksums provide reasonable assurance of integrity, and additional verification may not be worth the overhead.
Building a Reliable File Transfer System
For a robust file transfer system in a reconstruction context, the following design satisfies most needs with minimal complexity.
Use rsync over SSH for all regular file transfers and synchronization. rsync handles large files efficiently, resumes interrupted transfers, and provides delta synchronization for maintained directories. SSH provides authentication and encryption. The combination requires only SSH server software on the destination machine and rsync on both machines.
For transfers where you cannot install software on the destination, FTP with a passive mode server is the fallback. Configure the FTP server to use a dedicated account with restricted access to only the directories it needs to serve. Never use the same credentials for FTP that you use for system administration.
For one-time transfers of very large files, consider splitting the file into smaller pieces before transfer. Tools like split (Unix) divide a file into sequentially named pieces; the receiver reassembles them with cat. Smaller pieces are easier to verify individually and can be transferred in parallel if multiple connections are available.
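A split-and-reassemble round trip, shown here with a small file and 1 KB pieces so it runs quickly; for a real transfer a piece size like -b 100M is more typical, and the file names are arbitrary:

```shell
rm -f /tmp/payload.part.*                       # clear any stale pieces
head -c 3000 /dev/urandom > /tmp/payload        # stand-in for a large file
split -b 1024 /tmp/payload /tmp/payload.part.   # produces .aa, .ab, .ac
# Receiver: the shell expands the glob in name order, so cat reassembles correctly.
cat /tmp/payload.part.* > /tmp/payload.rebuilt
cmp /tmp/payload /tmp/payload.rebuilt && echo "reassembly OK"
```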
Always log file transfers, especially for critical data. Record what was transferred, when, to where, and whether verification succeeded. This log is essential for diagnosing problems and for proving data integrity if the received file is later questioned.