Jasper Wang’s blog

In a quiet place, a person can hear his own thoughts.

FTP and SFTP

Posted at — Oct 11, 2019


A brief introduction to FTP

FTP stands for “File Transfer Protocol.” Generally, it is a way to transfer files over a network. It is an application-level protocol that does its own job, just as HTTP, IMAP, and POP do theirs. You probably know that there are a lot of protocols in use on today’s internet; FTP is one of them.


Different protocols around us.


FTP consists of two parts: a server and a client. An FTP server offers access to a directory and its sub-directories. Users connect to the server with an FTP client, a piece of software that lets you download files from the server as well as upload files to it.

To make this easier to understand, think of an FTP server as a video rental store. Its shelves are filled with hundreds of videotapes and CDs (imagine these as the files on a server), and the store waits for people to rent them and take them home, or to bring rented tapes back. (In the real case, files are copied rather than rented.) Customers who come to the store are the FTP clients; they rent (download) and return (upload). That is how FTP works. To do this, a client needs FTP software, and the FTP server has to provide an address that clients can connect to. Clients also need a username and a password to log in.


download and upload like a video store.


FTP runs on top of TCP, and you probably know TCP’s three-way handshake. When a client sends a request to the server, the first thing that happens is the three-way handshake, which establishes the connection to the corresponding port. In the case of FTP, the client connects to port 21.

TCP belongs to the transport layer. The function of the “transport layer” is to establish “port-to-port” communication, while the function of the “network layer” is to create a “host-to-host” connection. As long as the host and port are determined, two programs can communicate with each other.
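To make the host-and-port idea concrete, here is a minimal Python sketch. It starts a tiny stand-in server on the local machine (it is not a real FTP server) that sends an FTP-style `220` greeting, and then connects to it the way a client would connect to port 21. The function names `start_fake_server` and `read_greeting`, the greeting text, and the use of an OS-chosen port instead of 21 are all assumptions made for illustration.

```python
import socket
import threading


def start_fake_server(greeting=b"220 Service ready\r\n"):
    """Start a one-shot stand-in server and return (thread, port).

    This is NOT a real FTP server; it only sends one greeting line.
    Port 0 asks the OS for any free port (a real FTP server listens on 21).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once():
        # accept() returns once a client has completed the TCP handshake.
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(greeting)
        listener.close()

    thread = threading.Thread(target=serve_once, daemon=True)
    thread.start()
    return thread, port


def read_greeting(host, port, timeout=5.0):
    """Connect to host:port over TCP and return the first reply line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = b""
        while not data.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii").strip()
```

Calling `start_fake_server()` and then `read_greeting("127.0.0.1", port)` with the returned port gives back the greeting line. Against a real FTP server you would pass its address and port 21 instead.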


FTP Mode

There are two modes of communication between the FTP server and the FTP client: active and passive. To avoid confusion, I will explain them step by step.

The first thing to keep in mind is that “active” and “passive” are always relative to the server side, not the client side. That is to say, in active mode the server is the active party for the data connection, and in passive mode the server is the passive party. So don’t confuse yourself by thinking from the client side.

Secondly, the ports can be a little confusing. To keep things simple, first remember the two well-known ports, 20 and 21. They are also relative to the server, just like the modes in the previous paragraph. Port 21 is for sending commands, and port 20 is for sending data (in active mode).

Now let’s take a deeper look at the two modes.

Active Mode

Active mode is also called PORT mode; it is the connection method originally defined by the FTP protocol. The client uses the PORT command to tell the server, over the command connection, which address and port it is listening on. The server then actively initiates the data connection to the client, which is why it is called active mode.

How active mode works.

Passive Mode

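Passive mode is also called PASV mode. The client sends the PASV command, the server replies with a port it is listening on, and the client opens the data connection to that port. Because the server only passively accepts the data connection from the client, it is called passive mode.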

How passive mode works.
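In both modes, the address for the data connection travels over the command connection as six comma-separated numbers: four for the IPv4 address and two for the port, where the port equals p1 * 256 + p2. The following is a small Python sketch of that encoding. The function names are my own, and the sample address and port are only illustrative.

```python
import re


def encode_port_argument(ip, port):
    """Build the argument of a PORT command: h1,h2,h3,h4,p1,p2."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address")
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    p1, p2 = divmod(port, 256)  # high byte, low byte
    return ",".join(octets + [str(p1), str(p2)])


def parse_pasv_reply(reply):
    """Extract (ip, port) from a reply such as
    '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)'."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError("no address found in reply")
    nums = [int(n) for n in match.groups()]
    ip = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]
    return ip, port
```

In active mode the client would send `PORT` followed by the output of `encode_port_argument`; in passive mode the client would run `parse_pasv_reply` on the server's answer and connect to the result.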

Why port?

If you are still confused about the term “port”, here is a brief explanation.

There are many programs on the same host that need to use the network. For example, you can chat online with friends while browsing the web. When a packet is sent from the Internet, how do you know whether it represents the content of a web page or the content of an online chat?

In other words, we also need a parameter to indicate which program (process) a packet is for. This parameter is called a “port”; it is a number assigned to each program that uses the network. Each packet is sent to a specific port on the host, so different programs can get the data they need.
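Here is a minimal Python sketch of that idea. Two sockets on the same machine stand in for the browser and the chat program, and each one only receives the packet addressed to its own port. UDP is used only to keep the example short, and the names `open_udp_port` and `deliver_by_port` are made up for this illustration.

```python
import socket


def open_udp_port():
    """Bind a UDP socket to a free port on this machine and return it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    sock.settimeout(5.0)
    return sock


def deliver_by_port():
    """Send two packets to one host; the port decides who receives each."""
    web = open_udp_port()    # stands in for the browser
    chat = open_udp_port()   # stands in for the chat program
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"web page", web.getsockname())
        sender.sendto(b"chat message", chat.getsockname())
        return {
            "web_port": web.getsockname()[1],
            "chat_port": chat.getsockname()[1],
            "web_got": web.recvfrom(1024)[0],
            "chat_got": chat.recvfrom(1024)[0],
        }
    finally:
        for s in (web, chat, sender):
            s.close()
```

Both packets go to the same host address, 127.0.0.1, yet each program sees only its own data because the port numbers differ.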


Now back to FTP: it uses two ports, 20 and 21. Again, on the server side, port 21 is for commands, such as user authentication and opening and closing the connection. Port 20 is for data transmission, but it is not fixed; it depends on the FTP mode. In active (PORT) mode the server sends data from port 20, while in passive mode the server picks another port for the data connection. Because the ports are separate, data connections and control connections do not get mixed up.

port to port connection for FTP.


What is the difference between FTP and HTTP?


A brief history of FTP

FTP was developed in the early 1970s by Abhay Bhushan while he was a student at MIT. It was initially created to allow the transfer of files between servers and host computers over the ARPANET (a precursor to the modern internet), which at that time used the Network Control Program.

Why two ports?

You might be curious about why FTP uses two ports. It seems the connection could easily have been specified on a single port. Given all the problems FTP has with firewalls and NAT, a single port looks like it would have been much better.

I find this a very interesting question. FTP was actually specified before NAT (Network Address Translation), firewalls, and full-duplex Ethernet were the norm.

To put it simply, legacy Ethernet was half-duplex, meaning information could move in only one direction at a time, so collisions could occur on the network. FTP’s use of two ports helped with this issue. The following are the main reasons for the two-port design:



Handle different things using 2 ports.


For example:

Let’s suppose Jack is the manager of Jacky and Lucy.

Jack wants two files from Jacky, so Jack connects to Jacky’s port 21 and asks for the files. When Jacky is ready, Jacky opens a connection to Jack from port 20 (it could be another port) and sends the files to Jack. Meanwhile, Lucy needs an urgent approval file from Jack, so Lucy connects to Jack on port 21 and asks for the file. Jack is busy with Jacky right now, so he opens the data connection from port 20 to Lucy only when he is ready.

The two ports serve different purposes and, for the sake of simplicity, the designers chose to use two different ports instead of implementing a negotiation protocol.


A brief history of HTTP

In 1989, Dr. Tim Berners-Lee, who was working at CERN, wrote a proposal for building a hypertext system over the network. It was built on top of the existing TCP and IP protocols and consisted of four parts: HTML, HTTP, the WorldWideWeb browser, and a server.

These four parts were completed by the end of 1990. On August 6, 1991, Tim Berners-Lee’s post to the public alt.hypertext newsgroup marked the beginning of the World Wide Web as a public project.

HTTP was very simple in its early stages. That early version was later called HTTP/0.9, and it is sometimes called the one-line protocol.


Early FTP

So, FTP came earlier than HTTP. The first FTP client application was a command-line program, developed before operating systems had graphical user interfaces. Most network tools used a command-line interface at that time, and FTP is easy to operate from the command line.

  1. Open a connection to the server, for example: open 192.168.1.15

  2. Enter your username and password.

  3. Once they are verified, you are connected.

Then you can upload and download files using the put and get commands. Other operations are almost the same as on a Linux system, like pwd, cd, ls, quit, etc.
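The same steps can be sketched in Python with the standard library's `ftplib` module. This is only a rough illustration: the helper names `download_file`, `upload_file`, and `example_session` are mine, `report.txt` is a made-up file name, and `example_session` needs a reachable FTP server, so it is defined but not called here.

```python
from ftplib import FTP


def download_file(ftp, remote_name, local_path):
    """Like 'get': copy remote_name from the server into local_path."""
    with open(local_path, "wb") as fh:
        ftp.retrbinary("RETR " + remote_name, fh.write)


def upload_file(ftp, local_path, remote_name):
    """Like 'put': copy local_path to the server as remote_name."""
    with open(local_path, "rb") as fh:
        ftp.storbinary("STOR " + remote_name, fh)


def example_session(host, user, password):
    """Mirror the command-line steps; requires a real FTP server."""
    ftp = FTP()
    ftp.connect(host, 21)        # step 1: open a connection (port 21)
    ftp.login(user, password)    # steps 2 and 3: log in and get verified
    print(ftp.pwd())             # like 'pwd'
    print(ftp.nlst())            # similar to 'ls' (names only)
    download_file(ftp, "report.txt", "report.txt")  # hypothetical file
    ftp.quit()                   # like 'quit'
```

The helpers take an already connected `ftp` object, so they can be reused for several transfers in one session.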

FTP was a handy tool at that time.


Several Differences


Is the FTP protocol obsolete?

There is a heated debate on this topic, and it’s tough for me to draw a conclusion. However, from my point of view as a beginner in networking, I hear about HTTP much more than FTP. The reason is that the functions the FTP protocol implements, including file upload and download, authentication, and resuming interrupted transfers, can also be handled by other protocols such as HTTP and SFTP.

FTP also has downsides: it is weak on security, because usernames, passwords, and file contents are sent without encryption. So it’s not recommended to send sensitive data over FTP, and that’s why we have SFTP. In other words, FTP is to SFTP roughly as HTTP is to HTTPS.

For the vast majority of ordinary users, there are very few opportunities to download via FTP. But FTP can still be found on intranets, like a school network or a library network system. FTP can also be a good choice for some closed commercial systems that offer downloadable resources. In practice, FTP is like telnet: it’s no longer recommended, and SFTP should be the first choice when you need FTP’s functionality.


Now, SFTP

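SFTP is short for “SSH File Transfer Protocol”; it is also often called “Secure File Transfer Protocol”.

As the name suggests, SFTP is a more secure way to transfer files online: it runs over an SSH connection, so data is encrypted before it is sent. In everyday use, SFTP is something you meet more on the client side than on the server side.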


What is SSH?

SSH, also known as Secure Shell or Secure Socket Shell, is a network protocol that gives administrators a secure way to access a remote computer. SSH establishes a cryptographically secured connection between two parties (client and server), authenticating each side to the other, and passing commands and output back and forth.

SSH consists of client software and server software, and by default they talk to each other on port 22. The client programs include scp, slogin, and sftp.

SFTP talks on port 22.


FTP and SFTP


Here is an example of an SFTP client: I use Cyberduck to upload files to my remote server on DigitalOcean.

I can also use ssh to log in to my server, in the same way as I mentioned before.
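For readers who prefer the command line to Cyberduck, the OpenSSH `sftp` and `ssh` clients do the same job. The Python sketch below only builds the command lines; it does not connect anywhere. The user name `alice` and host `example.com` are placeholders, not values from my server, and the helper names are hypothetical. Note that `sftp` takes the port with a capital `-P`, while `ssh` uses a lowercase `-p`.

```python
import shlex


def sftp_command(user, host, port=22):
    """Return the argument list for an OpenSSH sftp session."""
    return ["sftp", "-P", str(port), f"{user}@{host}"]


def ssh_command(user, host, port=22):
    """Return the argument list for an OpenSSH ssh login."""
    return ["ssh", "-p", str(port), f"{user}@{host}"]


if __name__ == "__main__":
    # Placeholder user and host, shown as shell-ready command lines.
    print(shlex.join(sftp_command("alice", "example.com")))
    print(shlex.join(ssh_command("alice", "example.com")))
```

Since 22 is the default port for both tools, the port option can be left out entirely unless the server listens somewhere else.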

Conclusion

Today, SFTP is still widely used in the field of encrypted file transmission.

Although we have fewer and fewer opportunities to use FTP, understanding FTP is still very helpful for understanding how networks work.