Sarasota PC Monitor
Tech Talk (6/05)
Internet Infrastructure & Operation
by Brian K. Lewis, Ph.D.*
Member of the Sarasota Personal Computer Users Group, Inc.
Were you ever curious about how web pages get to your computer or how your e-mail knows how to find the recipient? I know, you probably never thought about it as long as everything worked the way it should. Well, sit back, relax and take a deep breath. We are about to embark on a tour of how the Internet gets things done.
First of all, the Internet is a network of networks. All this means is that small networks, like home networks and business networks, connect to larger networks that cover the U.S. These networks also connect to international networks via cable or satellite connections. The points where these networks interconnect are referred to as network access points. There are multiple network access points to these large networks throughout the country and, in fact, throughout the world.
There are thousands of routers and servers connected to this network. They are the ones that do the work of identifying and distributing all the requests, answers, e-mail, etc. Routers may be of the simple variety found in a home or small office network. Such a router receives data packets from the outside or the inside of the network and passes them along toward their destination. For example, suppose you send an e-mail to me at Yahoo.com. The first thing that happens is that your e-mail software reads the address of your SMTP server from the setup information you entered when you installed the software. SMTP stands for Simple Mail Transfer Protocol. The software then connects to the computer that is the SMTP server. On receipt of the e-mail, the SMTP server parses the recipient address in the header to extract the domain, yahoo.com. If it finds yahoo.com in its table of already known addresses, it will convert the text "yahoo.com" to 216.109.112.135. This "dotted decimal number" is the 32-bit IP address for yahoo. Servers generally have static IP addresses, whereas home computers that use a dial-up connection usually get a new IP address every time they connect. Now, if the SMTP server does not have yahoo.com in its table, it will connect with a DNS (Domain Name System) server and ask it for the address. It may be necessary to go through several levels of DNS servers to finally get the IP address for yahoo.com.
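The lookup steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular mail server is written: the function names, the cache dictionary and the stand-in resolver `fake_dns` are all made up for the example, and the sample address is the one quoted in the paragraph above.

```python
from email.utils import parseaddr


def recipient_domain(address):
    """Return the lowercase domain part of an e-mail address."""
    _, addr = parseaddr(address)          # drops any display name
    return addr.rpartition("@")[2].lower()


def dotted_to_int(dotted):
    """Pack the four decimal fields of an IPv4 address into one 32-bit integer."""
    value = 0
    for field in dotted.split("."):
        value = (value << 8) | int(field)  # each field is 8 of the 32 bits
    return value


def lookup(domain, cache, ask_dns):
    """Use the cached address if there is one; otherwise ask a DNS resolver."""
    if domain not in cache:
        cache[domain] = ask_dns(domain)
    return cache[domain]


def fake_dns(name):
    """Hypothetical stand-in for a real DNS query; always gives a placeholder."""
    return "0.0.0.0"


# Hypothetical cache already holding the address quoted in the article.
cache = {"yahoo.com": "216.109.112.135"}
domain = recipient_domain("Brian Lewis <bwsail@yahoo.com>")
address = lookup(domain, cache, fake_dns)
print(domain, address, dotted_to_int(address))
```

To try this against a live resolver, `socket.gethostbyname` from the standard library could be passed in place of `fake_dns`. The address it returns today will probably differ from the 2005 value above.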
Once the IP address is determined, it is added to the packet header and the e-mail is sent on its way. It may pass through a number of different routers before it gets to yahoo. However, with the IP address in the header, each router can send it out via the path it judges best, often the one carrying the least traffic. Once the e-mail arrives at yahoo, the recipient is identified from the header and the message is passed to the proper POP (Post Office Protocol) or IMAP (Internet Message Access Protocol) server. This whole process takes far less time than you just spent reading about it.
O.K., so far we have seen that the Internet has routers, mail sending servers, mail receiving servers and domain name servers. It also has web servers. These are the computers that maintain the web pages everyone is familiar with. When you enter a destination address in the address line of your browser, a request is sent to the web server to "get" the page. So the browser, Internet Explorer in most cases, takes the address (http://somedomain.com) and breaks it into its components. The first part is the protocol, http or Hypertext Transfer Protocol. The browser then obtains the IP address for somedomain.com. Next the browser sends a request to the server. The server responds by sending the HTML text for the page back to your computer. HTML stands for Hypertext Markup Language, the markup language used for producing web pages. The browser reads the text and displays it on your computer screen. This, of course, is the simplest case. In a real situation the transmission path will go through various routers and servers before reaching the final destination. It may also be necessary for a DNS server to be involved in getting the destination IP address, just as in the e-mail example.
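As a rough sketch of what the browser does with the address, the Python below splits a URL into its components and builds the text of a simple "get" request. The function name `build_get_request` is invented for this example, and a real browser sends many more header lines than the two shown here.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Split a URL into its parts and build a minimal HTTP GET request."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80        # http uses port 80 unless the URL says otherwise
    path = parts.path or "/"       # an empty path means the site's home page
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"                     # a blank line ends the request headers
    )
    return parts.scheme, host, port, request


scheme, host, port, request = build_get_request("http://somedomain.com")
print(scheme, host, port)
print(request)
```

The sketch only builds the request text and does not send it anywhere. Sending it would mean opening a connection to the server's IP address on the port returned above.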
There is one other aspect of this operation to consider. The interaction of the browser and the HTTP server occurs through port 80 on the web server. E-mail transactions utilize port 25 for SMTP, 110 for POP3 and 143 for IMAP. There are also other ports for other purposes. In fact, your computer has ports numbered from 0 to 65,535. Ports are nothing more than numbered endpoints for network connections that can be made between your computer and another one. Multiple connections can be made at the same time using different ports. If any of these ports are open, that is, unprotected, they can be used by people outside your computer to connect with your computer without your knowledge. There are software programs specifically designed to look for unprotected ports. These scanning programs check thousands of computers in a fraction of a minute for any open ports that can be used for a connection. If you have an unprotected port, an intruder can install a Trojan Horse program that can use your computer to attack others. Or the program may just steal passwords or other personal information. This is just another reason to use a router and/or a software firewall to prevent penetration and takeover of your computer.
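To see which of the ports mentioned above are accepting connections on your own machine, a small Python check along these lines can help. It is a minimal sketch: the names `WELL_KNOWN` and `is_port_open` are made up for the example, and it should only be pointed at computers you own.

```python
import socket

# Port numbers named in the article.
WELL_KNOWN = {"http": 80, "smtp": 25, "pop3": 110, "imap": 143}
MAX_PORT = 2**16 - 1   # port numbers are 16 bits wide, so the highest is 65,535


def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port is accepted."""
    if not 0 <= port <= MAX_PORT:
        raise ValueError(f"port must be between 0 and {MAX_PORT}")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


# Check only this computer (127.0.0.1 is the local loopback address).
for name, port in WELL_KNOWN.items():
    state = "open" if is_port_open("127.0.0.1", port) else "closed"
    print(f"{name:5} port {port:3} is {state}")
```

On a typical home computer all four should report closed. An open one means some program on the machine is listening for connections on that port.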
Now most of you are familiar with interactive web pages, those where you enter some information and the page is updated. These are dynamic pages as opposed to static pages that just contain information that you read. The dynamic pages might be ones where you are placing an order from an Internet store, as one example. In most cases the action carried out by the web server depends on a CGI script. CGI, the Common Gateway Interface, is not a programming language itself. It is a standard way, in widespread use on the Internet, for a web server to hand your entry to a program, which is often written in a scripting language such as Perl. In some cases, the program may be written in Java instead. The point is that your entry on the page results in a pre-programmed response on the part of the web server.
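A CGI-style program can be quite short. The Python sketch below assumes the web server passes the visitor's entry in the `QUERY_STRING` environment variable, which is how CGI delivers data from a simple form. The field names `item` and `qty` and the function name `handle_order` are invented for the example.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_order(environ):
    """Build the server's pre-programmed response to one order form entry."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    item = query.get("item", ["nothing"])[0]   # hypothetical form field
    quantity = query.get("qty", ["1"])[0]      # hypothetical form field
    body = (
        "<html><body>"
        # escape() keeps a visitor's entry from being treated as HTML
        f"<p>You ordered {escape(quantity)} x {escape(item)}.</p>"
        "</body></html>"
    )
    # A CGI program writes a header, a blank line, then the page itself.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(handle_order(os.environ), end="")
```

Run from a web server configured for CGI, a request ending in `?item=mouse&qty=2` would produce a page confirming that entry.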
In some cases the script may place a cookie on your computer's hard drive. This would be true if you were making a purchase from an Internet store. This cookie provides a record of the contents of your cart or basket. Otherwise the information would be lost when you move on to another page, even on the same site. These cookies are usually deleted when the purchase transaction is completed. However, if you close your browser or leave the company's Internet site, the cookie may be retained for up to thirty days. That way, if you return to the purchase site, you can complete the transaction even if several days have passed.
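Here is a minimal Python sketch of how a script might create and read back such a cookie. It assumes the cookie holds a cart identifier rather than the full cart contents. The cookie name `cart`, the value `abc123` and both function names are made up for the example, and the thirty-day lifetime is taken from the paragraph above.

```python
from http.cookies import SimpleCookie

THIRTY_DAYS = 30 * 24 * 60 * 60   # cookie lifetime in seconds


def make_cart_cookie(cart_id):
    """Create a cookie that remembers a shopping cart for thirty days."""
    cookie = SimpleCookie()
    cookie["cart"] = cart_id
    cookie["cart"]["max-age"] = THIRTY_DAYS
    cookie["cart"]["path"] = "/"   # send it back for every page on the site
    return cookie


def read_cart_cookie(header_value):
    """Return the cart identifier from a Cookie header, or None if absent."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return cookie["cart"].value if "cart" in cookie else None


cookie = make_cart_cookie("abc123")
print(cookie.output())                 # the header line the server would send
print(read_cart_cookie("cart=abc123"))
```

The first function produces the header the server sends to the browser. The second shows how the server recovers the value when the browser returns it on a later visit.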
The more you delve into the details of the operation of the Internet, the more complex it becomes. For example, how a router handles each packet it receives can become a very complex series of steps. So, in this tour I have just tried to provide an overview of the total operation. If you want more details, try looking up howstuffworks.com. You'll be amazed at the range of information that can be found on this web site. Happy surfing!
*Dr. Lewis is a former university & medical school professor. He has been working with personal computers for more than thirty years. He can be reached via e-mail at bwsail@yahoo.com.
Copyright 2005. This article is from the June 2005 issue of the Sarasota PC Monitor, the official monthly publication of the Sarasota Personal Computer Users Group, Inc., P.O. Box 15889, Sarasota, FL 34277-1889. Permission to reprint is granted only to other non-profit computer user groups, provided proper credit is given to the author and our publication. We would appreciate receiving a copy of the publication the reprint appears in; please send it to the above address, Attn: Editor. For further information about our group, email: admin@spcug.org / Web: http://www.spcug.org/
The Sarasota Personal Computer Users Group, Inc. has 1,100+ members and was established in 1982. We are members of the Assoc. of PC User Groups (APCUG), the Florida Assoc. of PC Users Groups, Inc., and we are members of the America Online Ambassador Program.
See http://www.spcug.org for all reviews from the Sarasota PC Monitor; go to the Newsletter Section.