The HTTP protocol

A detailed description of how the HTTP protocol and the Web work

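HTTP (Hypertext Transfer Protocol) is one of the application protocols of TCP/IP, the suite of protocols that powers the Internet.

Let me correct that: it’s not just one of the protocols, it’s certainly the most successful and popular one.

HTTP is what makes the World Wide Web work, giving browsers a language to communicate with the remote servers that host web pages.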

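HTTP was first documented in 1991 (the version now known as HTTP/0.9), as a result of the work Tim Berners-Lee had been doing since 1989 at CERN, the European Organization for Nuclear Research.

The goal was to let researchers easily exchange and interlink their papers. It was meant as a way for the scientific community to work better.

Back then, the main applications of the Internet were FTP (the File Transfer Protocol), email and Usenet (newsgroups, almost abandoned today).

In 1993 Mosaic, the first widely popular graphical web browser, was released, and things skyrocketed from there.

The Web became the killer app of the Internet.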

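Over time the Web and the ecosystem around it have evolved dramatically, but the basics remain. One example of that evolution: in addition to web pages, HTTP now powers REST APIs, a common way to access a service programmatically over the Internet.

HTTP got a minor revision in 1997 with HTTP/1.1. Its successor, HTTP/2, was standardized in 2015 and is now implemented by the major web servers used across the globe.

Like any other protocol (SMTP, FTP...) that is not served over an encrypted connection, HTTP is considered insecure. This is why there is a big push nowadays towards HTTPS, which is HTTP served over TLS.

That said, the building blocks of HTTP/2 and HTTPS have their roots in HTTP, and in this article I’ll introduce how HTTP works.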

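HTML documents

HTTP is the way web browsers like Chrome, Firefox, Edge and many others (called clients from here on) communicate with web servers.

The name Hypertext Transfer Protocol comes from the need to transfer not just files, as FTP (the “File Transfer Protocol”) does, but hypertexts: documents written in HTML and then rendered graphically by the browser, with a nice presentation and interactive links.

Links were the driving force behind adoption, along with the ease of creating new web pages.

HTTP is what transfers those hypertext files (and, as we’ll see, images and other file types) over the network.

Inside a web browser, a document can point to another document using links.

A link starts with a first part that determines the protocol and the server address, given either as a domain name or as an IP address.

This part is not unique to HTTP, of course.

Then there’s the document part. Anything appended to the address part represents the document path.

For example, this document’s address is https://flaviocopes.com/http/

  • https is the protocol.
  • flaviocopes.com is the domain name that points to my server.
  • /http/ is the document URL, relative to the server root path.

The path can be nested: in https://flaviocopes.com/page/privacy/ the document URL is /page/privacy/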
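
As a rough illustration of the split described above, Python’s standard urllib.parse module can break an address into protocol, server and path. This is only a sketch; the variable names are mine.

```python
from urllib.parse import urlsplit

# Split the two example addresses from the text into their parts.
for address in ("https://flaviocopes.com/http/",
                "https://flaviocopes.com/page/privacy/"):
    parts = urlsplit(address)
    print(parts.scheme)  # the protocol
    print(parts.netloc)  # the domain name (or IP) of the server
    print(parts.path)    # the document path, relative to the server root
```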

The web server is responsible for interpreting the request and, once it has analyzed it, serving the correct response.

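A request

What’s in a request?

The first thing is the URL, which we’ve already seen.

When we enter an address and press Enter in the browser, under the hood the browser sends a request like this to the correct IP address: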

GET /a-page

where /a-page is the URL you requested.

The second thing is the HTTP method (also called verb).

HTTP in the early days defined 3 of them:

  • GET
  • POST
  • HEAD

and HTTP/1.1 introduced

  • PUT
  • DELETE
  • OPTIONS
  • TRACE

We’ll see them in detail in a minute.

The third thing that composes a request is a set of HTTP headers.

Headers are a set of key: value pairs used to communicate specific, predefined information to the server, so the server knows what we mean.

I described them in detail in the HTTP request headers list.

Give that list a quick look. All of those headers are optional, except Host.

HTTP methods

GET

GET is the most used method. It’s the one used when you type a URL in the browser address bar, or when you click a link.

It asks the server to send the requested resource as a response.

HEAD

HEAD is just like GET, but tells the server not to send the response body back, just the headers.
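
To see the difference between GET and HEAD, here is a minimal sketch that starts a throwaway server on the local machine with Python’s standard http.server module and queries it with http.client. The handler class, the page content and the helper function are assumptions made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<h1>Hello</h1>"  # hypothetical page content


class DemoHandler(BaseHTTPRequestHandler):
    """Answers GET with headers and body, HEAD with headers only."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        self._send_headers()  # same headers as GET, but no body is written

    def log_message(self, *args):
        pass  # keep the demo quiet


def request(method, port):
    """Send one request to the local demo server; return (status, Content-Length, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, "/a-page")
    response = conn.getresponse()
    result = (response.status, response.getheader("Content-Length"), response.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: use any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = request("GET", port)
head_result = request("HEAD", port)
print(get_result)   # headers plus the page content
print(head_result)  # same headers, empty body

server.shutdown()
server.server_close()
```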

POST

The client uses the POST method to send data to the server. It’s typically used in forms, for example, but also when interacting with a REST API.
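
As an illustration, this is roughly what a browser puts on the wire when a form is submitted with POST. The form fields and the /contact path are hypothetical, and the sketch only builds the request bytes without sending them.

```python
from urllib.parse import urlencode

# Hypothetical form fields and path, for illustration only.
form = {"name": "Flavio", "message": "Hello world"}
body = urlencode(form).encode("ascii")

request_lines = [
    "POST /contact HTTP/1.1",
    "Host: flaviocopes.com",
    "Content-Type: application/x-www-form-urlencoded",
    f"Content-Length: {len(body)}",
    "",  # the empty line that ends the headers
    "",
]
raw_request = "\r\n".join(request_lines).encode("ascii") + body
print(raw_request.decode("ascii"))
```

Unlike a GET request, the data travels in the request body, after the empty line that closes the headers.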

PUT

The PUT method is intended to create a resource at that specific URL, or replace the one already there, using the content passed in the request body. It is mainly used in REST APIs.

DELETE

The DELETE method is called against a URL to request deletion of that resource. It is mainly used in REST APIs.
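
The PUT and DELETE behaviour described above can be sketched with a toy in-memory store. This is not how any particular server implements it: the handle function, the dictionary and the /notes/1 path are assumptions, and the status codes are the ones a REST-style server would conventionally return.

```python
# A toy, in-memory stand-in for a server's resources: path -> stored body.
resources = {}


def handle(method, path, body=b""):
    """Return the status code a simple REST-style server might answer with."""
    if method == "PUT":
        created = path not in resources
        resources[path] = body          # create the resource, or replace it
        return 201 if created else 200  # 201 Created / 200 OK
    if method == "DELETE":
        if path in resources:
            del resources[path]
            return 204                  # 204 No Content: deleted
        return 404                      # 404 Not Found: nothing to delete
    return 405                          # 405 Method Not Allowed


print(handle("PUT", "/notes/1", b"buy milk"))   # creates the resource
print(handle("PUT", "/notes/1", b"buy bread"))  # replaces it
print(handle("DELETE", "/notes/1"))             # deletes it
print(handle("DELETE", "/notes/1"))             # it is already gone
```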

OPTIONS

When a server receives an OPTIONS request it should send back the list of HTTP methods allowed for that specific URL.
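
A minimal sketch of that exchange, again using a throwaway local server built with Python’s http.server. The handler class and the list of allowed methods are made up for the example; the Allow response header is the standard place for a server to list them.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class OptionsHandler(BaseHTTPRequestHandler):
    """Answers OPTIONS with the methods this (hypothetical) URL supports."""

    def do_OPTIONS(self):
        self.send_response(204)  # No Content: the answer is in the headers
        self.send_header("Allow", "GET, HEAD, OPTIONS")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), OptionsHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("OPTIONS", "/a-page")
response = conn.getresponse()
allowed = response.getheader("Allow")
response.read()
conn.close()

server.shutdown()
server.server_close()
print(allowed)
```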

TRACE

Returns to the client the request that the server received. It is used for debugging or diagnostic purposes.

HTTP Client/Server communication

HTTP is a stateless protocol: each request is handled independently of the ones that came before it.

Servers have no idea what the current state of the client is. All they care about is that they get requests and need to fulfill them.

No prior request matters in this context. This lets a web server be very fast, since there is less to process, and leaves it capacity to handle a lot of concurrent requests.

HTTP is also very lean, with little overhead in the communication. This contrasts with the protocols that were the most used at the time HTTP was introduced: FTP and POP/SMTP, the mail protocols, which involve lots of handshaking and confirmations on the receiving end.

Graphical browsers abstract all this communication, but we’ll illustrate it here for learning purposes.

A message is composed of a first line, which starts with the HTTP method, then contains the relative path of the resource and the protocol version:

GET /a-page HTTP/1.1

After that, we need to add the HTTP request headers. As mentioned above, there are many headers, but the only mandatory one is Host:

GET /a-page HTTP/1.1
Host: flaviocopes.com
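
A small sketch of how such a message could be assembled in Python. The build_request helper is hypothetical; the one detail not visible in the listing above is that, on the wire, HTTP/1.1 ends each line with CRLF and closes the header section with an empty line.

```python
def build_request(method, path, host, extra_headers=None):
    """Return the raw bytes of an HTTP/1.1 request that has no body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; one extra empty line closes the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_request("GET", "/a-page", "flaviocopes.com").decode("ascii"))
```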

How can you test this? Using telnet. This is a command-line tool that lets us connect to any server and send it commands.

Open your terminal, and type telnet flaviocopes.com 80

This opens a connection and prints

Trying 178.128.202.129...
Connected to flaviocopes.com.
Escape character is '^]'.

You are connected to the Netlify web server that powers my blog. You can now type:

GET /axios/ HTTP/1.1
Host: flaviocopes.com

and press enter on an empty line to fire the request.

The response will be:

HTTP/1.1 301 Moved Permanently
Cache-Control: public, max-age=0, must-revalidate
Content-Length: 46
Content-Type: text/plain
Date: Sun, 29 Jul 2018 14:07:07 GMT
Location: https://flaviocopes.com/axios/
Age: 0
Connection: keep-alive
Server: Netlify

Redirecting to https://flaviocopes.com/axios/

See, this is an HTTP response we got back from the server. It’s a 301 Moved Permanently response. See the HTTP status codes list to learn more about the status codes.

It basically tells us the resource has permanently moved to another location.

Why? Because we connected to port 80, which is the default for HTTP, but on my server I set up an automatic redirection to HTTPS.

The new location is specified in the Location HTTP response header.

There are other headers, all described in the HTTP response headers list.

In both the request and the response, an empty line separates the headers from the body. The response body in this case contains the string

Redirecting to https://flaviocopes.com/axios/

which is 46 bytes long, as specified in the Content-Length header. It is shown in the browser when you open the page, while it automatically redirects you to the correct location.
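
The structure just described (status line, headers, empty line, body) can be checked with a few lines of Python. This is a sketch: parse_response is a hypothetical helper, only some of the headers from the response above are reproduced, and the trailing newline in the body is my assumption to account for the 46 bytes.

```python
# Part of the response shown above, with the CRLF line endings HTTP uses on the wire.
# Assumption: the body ends with a newline, which makes it 46 bytes long.
RAW_RESPONSE = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Cache-Control: public, max-age=0, must-revalidate\r\n"
    b"Content-Length: 46\r\n"
    b"Content-Type: text/plain\r\n"
    b"Location: https://flaviocopes.com/axios/\r\n"
    b"\r\n"
    b"Redirecting to https://flaviocopes.com/axios/\n"
)


def parse_response(raw):
    """Split a raw HTTP response into status code, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the empty line
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


status, headers, body = parse_response(RAW_RESPONSE)
print(status, headers["location"])
print(len(body) == int(headers["content-length"]))
```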

In this case we’re using telnet, the low-level tool that we can use to connect to any server, so we can’t have any kind of automatic redirect.
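
What telnet does here can be approximated with Python’s standard socket module: open a TCP connection, write the request bytes, read whatever comes back. The send_raw_request helper is hypothetical, and the Connection: close header is my addition so that the server closes the connection once the response is complete.

```python
import socket

# The request typed into telnet, plus "Connection: close" (an addition,
# so the server ends the connection after answering).
REQUEST = (
    b"GET /axios/ HTTP/1.1\r\n"
    b"Host: flaviocopes.com\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)


def send_raw_request(host, port, raw_request, timeout=10):
    """Send raw bytes over TCP and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(raw_request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Needs network access:
# print(send_raw_request("flaviocopes.com", 80, REQUEST).decode("utf-8", "replace"))
```

Like telnet, this does nothing with the Location header: following the redirect would mean sending a second request ourselves.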

Let’s do this process again, now connecting to port 443, which is the default port of the HTTPS protocol. We can’t use telnet because of the TLS handshake that must happen first.

Let’s keep things simple and use curl, another command-line tool. We cannot directly type the HTTP request, but we’ll see the response:

curl -i https://flaviocopes.com/axios/

This is what we’ll get in return:

HTTP/1.1 200 OK
Cache-Control: public, max-age=0, must-revalidate
Content-Type: text/html; charset=UTF-8
Date: Sun, 29 Jul 2018 14:20:45 GMT
Etag: "de3153d6eacef2299964de09db154b32-ssl"
Strict-Transport-Security: max-age=31536000
Age: 152
Content-Length: 9797
Connection: keep-alive
Server: Netlify

<!DOCTYPE html> <html prefix="og: http://ogp.me/ns#" lang="en"> <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <title>HTTP requests using Axios</title> …

I cut the response, but you can see that the HTML of the page is being returned now.
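
For comparison, here is a rough Python counterpart of the curl command, using the standard http.client module, which takes care of the TLS handshake for https URLs. The fetch and show helpers are hypothetical names, and like curl without extra flags the sketch does not follow redirects.

```python
import http.client
from urllib.parse import urlsplit


def fetch(url, timeout=10):
    """GET a URL and return (status, reason, headers, body), without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # HTTPSConnection performs the TLS handshake before any HTTP is sent.
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("GET", target)
        response = conn.getresponse()
        return response.status, response.reason, response.getheaders(), response.read()
    finally:
        conn.close()


def show(url):
    """Print the status, the headers and the start of the body, roughly like `curl -i`."""
    status, reason, headers, body = fetch(url)
    print(status, reason)
    for name, value in headers:
        print(f"{name}: {value}")
    print()
    print(body[:200].decode("utf-8", errors="replace"))
```

Calling show("https://flaviocopes.com/axios/") needs network access, and the headers it prints will depend on what the server sends at that moment.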

Other resources

An HTTP server will not just transfer HTML files, but typically it will also serve other files: CSS, JS, SVG, PNG, JPG, lots of different file types.

This depends on the configuration.

HTTP is perfectly capable of transferring those files as well. The client learns the file type from the Content-Type response header, so it can interpret each file in the right way.
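
Servers commonly derive the Content-Type header from the file extension. As a small sketch of that mapping, Python’s standard mimetypes module can guess the type for a few hypothetical file names:

```python
import mimetypes

# Hypothetical file names; the guess is based on the extension only.
for filename in ("index.html", "style.css", "logo.png", "photo.jpg"):
    content_type, _encoding = mimetypes.guess_type(filename)
    print(filename, "->", content_type)
```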

This is how the web works: when an HTML page is retrieved by the browser, it’s interpreted, and any other resource it needs to display properly (CSS, JavaScript, images...) is retrieved through additional HTTP requests, often to the same server.
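
To give an idea of that step, here is a simplified sketch that scans a page for the resources a browser would request next, using Python’s standard html.parser. The ResourceFinder class and the sample page are made up for the example, and a real browser looks at many more tags and attributes than these three.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collects the URLs of stylesheets, scripts and images referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet":
            if attributes.get("href"):
                self.resources.append(attributes["href"])
        elif tag in ("script", "img") and attributes.get("src"):
            self.resources.append(attributes["src"])


# A hypothetical page, for illustration only.
PAGE = """<!DOCTYPE html>
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body><img src="/logo.png"></body></html>"""

finder = ResourceFinder()
finder.feed(PAGE)
print(finder.resources)  # each entry would trigger one more HTTP request
```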

