What happens when you type any URL in your browser and press Enter
When a user types a URL into a web browser, that user is already working at the Application layer, the 7th and topmost layer of the famous OSI model. In other words, we are skipping a whopping 6 layers of the OSI model and starting straight at the top. A lot has to happen behind the scenes in those lower layers before we even get here; it just happens insanely quickly. How we got to this 7th layer is a long story for another day. For now, let's see what happens when we surf to a web page, e.g. https://www.paswebs.com or https://www.holbertonschool.com
I will first break the process down into a few simple steps and then explain each of them in more detail.
- 1. You enter a URL into the web browser
- 2. The browser looks up the IP address for the entered domain name via DNS
- 3. The browser sends an HTTP request to that IP address on the appropriate port. The request is encrypted if it uses HTTPS, the traffic passes through a firewall, and a load balancer (if any) forwards the request to an appropriate web server.
- 4. The server sends back an HTTP response
- 5. The browser begins rendering the HTML body of the response
- 7. Once the page is loaded, the browser sends further asynchronous requests if needed, e.g. AJAX calls.
That is a simple (maybe oversimplified) explanation. If it is still not clear, keep reading. Let's take it one step at a time, this time in a little more detail.
Here is how we surf a web page step by step:
- 1. You type a URL into the address bar in your preferred browser.
- 2. The browser parses the URL to find the protocol, host, port, and path.
- 3. It forms an HTTP request (HTTP was most likely the protocol).
- 4. To reach the host, it first needs to translate the human-readable host (or, if you like, the domain name) into an IP address, and it does this with a DNS lookup on the host.
- 5. Then a socket needs to be opened from the user's computer to that IP address, on the port specified (most often port 80 for HTTP or port 443 for HTTPS)
- 6. When a connection is open, the HTTP request is sent to the host
- 7. The host forwards the request to the server software (for example Apache or Nginx) configured to listen on the specified port
- 8. The server inspects the request (most often only the path) and launches the server plugin needed to handle the request (corresponding to the server language you use, such as PHP, Java, .NET, or Python)
- 9. The plugin gets access to the full request and starts to prepare an HTTP response.
- 10. To construct the response a database is (most likely) accessed. A database search is made, based on parameters in the path (or data) of the request
- 11. Data from the database, together with other information the plugin decides to add, is combined into a long string of text (probably HTML).
- 12. The plugin combines that data with some metadata (in the form of HTTP headers) and sends the HTTP response back to the browser.
- 13. The browser receives the response and parses the HTML in the response (which, more often than not, is not strictly valid)
- 14. A DOM tree is built out of the HTML; the parser recovers from broken markup instead of rejecting the page
- 16. Stylesheets are parsed, and the rendering information in each gets attached to the matching node in the DOM tree
- 18. The browser renders the page on the screen according to the DOM tree and the style information for each node.
- 19. Yay!!! you see the page on your screen.
- 20. But if you have low bandwidth, you get annoyed that the whole process was too slow.
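Steps 2 and 3 of the list above (parsing the URL and forming the request) can be sketched in a few lines of Python. This is only an illustration and not how a real browser is implemented: the helper names `parse_url` and `build_request` and the `DEFAULT_PORTS` table are made up for this example, and the request contains only the bare minimum of headers.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only these two schemes are handled.
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_url(url):
    """Split a URL into protocol, host, port, and path (step 2)."""
    parts = urlsplit(url)
    # A URL only carries an explicit port when it differs from the default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # An empty path means the root of the site.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.hostname, port, path


def build_request(host, path):
    """Form a minimal HTTP/1.1 GET request as text (step 3)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines)


scheme, host, port, path = parse_url("https://www.holbertonschool.com")
print(scheme, host, port, path)
print(build_request(host, path))
```

A real browser adds many more headers (User-Agent, Accept, cookies, and so on), and for HTTPS it encrypts the request with TLS before sending it.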
More explanation in words.
After you enter the URL, the first thing that needs to happen is to resolve the IP address associated with the domain name. DNS does this. DNS is like a phone book: it gives us the IP address associated with a domain name, just as a phone book gives us the mobile number associated with a person's name.
That is the overview, but the domain name query actually passes through four layers. Let's understand the steps:
After you enter the URL, the browser cache is checked first, because the browser keeps DNS records for some time for the websites you have visited earlier. If the record is not there, the query moves on to the OS cache, and then to the router cache. If none of these can resolve the query, it is handed to the resolver server, which is usually run by your ISP (Internet service provider), and the resolver checks its own cache. If there is still no result, the resolver sends a request to a root server at the top of the DNS hierarchy. The root server never answers "no results found"; instead, it tells the resolver where the information can be found. Based on the top-level domain of the name (.com, .net, .gov, .org), it refers the resolver to the matching TLD (top-level domain) server.
Now the resolver asks the TLD server for the IP address of our domain name. The TLD server stores information about which name servers are responsible for the domains under it, so it tells the resolver to ask the authoritative name server. The authoritative name server is responsible for knowing everything about the domain name. Finally, the resolver (or, if you like, the ISP) gets the IP address associated with the domain name and sends it back to the browser. The resolver also stores the IP address in its cache, so that the next time the same query comes in it does not have to go through all these steps again and can answer from its cache.
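The lookup order described above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the dictionaries stand in for the caches and servers, the server names (`tld-com`, `ns-example`) are invented, and `192.0.2.10` is an address reserved for documentation, not a real answer. Real DNS uses network queries, record types, and expiry times (TTLs) that this sketch leaves out.

```python
# Four cache layers, checked in this order (all empty at the start).
BROWSER_CACHE = {}
OS_CACHE = {}
ROUTER_CACHE = {}
RESOLVER_CACHE = {}

# Made-up DNS hierarchy: each level only knows whom to ask next.
ROOT_SERVER = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}


def resolve_from_hierarchy(name):
    """What the resolver does on a cache miss: root -> TLD -> authoritative."""
    labels = name.split(".")
    tld = labels[-1]                 # e.g. "com"
    domain = ".".join(labels[-2:])   # e.g. "example.com"
    tld_server = ROOT_SERVER[tld]                  # root refers us to the TLD server
    auth_server = TLD_SERVERS[tld_server][domain]  # TLD refers us to the authoritative server
    return AUTHORITATIVE[auth_server][name]        # authoritative server gives the answer


def lookup(name):
    """Return (ip, source), where source says which layer answered."""
    layers = [
        ("browser", BROWSER_CACHE),
        ("os", OS_CACHE),
        ("router", ROUTER_CACHE),
        ("resolver", RESOLVER_CACHE),
    ]
    for label, cache in layers:
        if name in cache:
            return cache[name], label
    ip = resolve_from_hierarchy(name)
    # Remember the answer so the next query can skip the full walk.
    RESOLVER_CACHE[name] = ip
    BROWSER_CACHE[name] = ip
    return ip, "authoritative"


print(lookup("www.example.com"))  # first call walks root -> TLD -> authoritative
print(lookup("www.example.com"))  # second call is answered from the browser cache
```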
Once the IP address of the computer that hosts your website is found, the browser initiates a connection with it. Communication over the network follows the Internet protocols, and TCP/IP is the most common protocol suite. A connection is established between the two computers using a process called the 'TCP 3-way handshake'. Let's understand the process in brief:
- 1. The client computer sends a SYN message, asking whether the second computer is open for a new connection.
- 2. If the second computer is open for a new connection, it replies with an acknowledgement together with a SYN message of its own (SYN-ACK).
- 3. The first computer receives this message and acknowledges it by sending an ACK message.
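The three messages above can be modelled with a small Python simulation. This is a sketch of the bookkeeping only, not real networking: the function name `three_way_handshake` and the starting sequence numbers 1000 and 5000 are arbitrary choices for this example (real TCP stacks pick their initial sequence numbers themselves, and the operating system performs the handshake for you when a socket connects).

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged to open a TCP connection."""
    # 1. Client asks to open a connection and announces its sequence number.
    syn = {"from": "client", "flags": "SYN", "seq": client_isn}
    # 2. Server acknowledges the client's SYN (ack = client seq + 1)
    #    and announces its own sequence number in the same segment.
    syn_ack = {
        "from": "server",
        "flags": "SYN-ACK",
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_MODULUS,
    }
    # 3. Client acknowledges the server's SYN (ack = server seq + 1).
    ack = {
        "from": "client",
        "flags": "ACK",
        "seq": (syn["seq"] + 1) % SEQ_MODULUS,
        "ack": (syn_ack["seq"] + 1) % SEQ_MODULUS,
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each side acknowledges the other's sequence number plus one, which is how both computers confirm that they heard each other before any data is sent.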
Communication Starts (Request-Response Process)
Finally, the connection is established between the client and the server. Now they can communicate with each other and share information. After a successful connection, the browser (client) sends a request to the server saying which content it wants. The server knows what response it should send for every request, so it sends a response back. This response contains the information you requested, such as the web page, along with metadata like the status code and Cache-Control headers. Now the browser renders the content that was requested.
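To make the shape of such a response concrete, here is a small Python sketch that takes a response as raw text and separates the status code, the headers, and the body. The sample response in `RAW_RESPONSE` and the helper name `parse_response` are invented for this example; real responses are bytes, may be compressed or chunked, and are handled by the browser's own HTTP stack.

```python
# A made-up response, written out the way it travels over the connection.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Cache-Control: max-age=3600\r\n"
    "\r\n"
    "<html><body><h1>Hello</h1></body></html>"
)


def parse_response(raw):
    """Split a raw HTTP/1.1 response into status code, reason, headers, body."""
    # A blank line separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, status_code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return int(status_code), reason, headers, body


status, reason, headers, body = parse_response(RAW_RESPONSE)
print(status, reason)
print(headers["cache-control"])
print(body)
```

The status code tells the browser whether the request succeeded, the Cache-Control header tells it how long it may reuse the response, and the body is the HTML it goes on to render.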
That's basically all that there is to it.