The Internet is the world's largest computer network. It is a global information infrastructure composed of millions of computers organized into hundreds of thousands of smaller, local networks. The term "information superhighway" is sometimes used to describe the function that the Internet provides: an international, high-speed telecommunications network that offers open access to the general public.
The Internet provides a variety of services, including electronic mail (e-mail), the World Wide Web (WWW), Intranets, File Transfer Protocol (FTP), Telnet (for remote login to host computers), and various file-location services.
Electronic mail, or e-mail, is the most widely used service on the Internet today. Millions of messages are passed via Internet lines every day throughout the world. Compared to the postal service, overnight delivery companies, and telephone conversations, e-mail via the Internet is extremely cost-effective and fast. E-mail facilities include sending and receiving messages, broadcasting messages to several recipients at once, storing and organizing messages, forwarding messages to other interested parties, maintaining address books of e-mail partners, and even transmitting files (called "attachments") along with messages.
Internet e-mail messages are sent to an e-mail address. The structure of an e-mail address is as follows: PersonalID@DomainName
The personal identifier could be a person's name or some other way to uniquely identify an individual. The domain, which appears to the right of the "at" (@) sign, indicates the location of that individual. A domain name is the unique name of a collection of computers connected to the Internet, usually owned by or operated on behalf of a single organization (company, school, or agency). The domain name consists of two or more sections, each separated by a period.
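The PersonalID@DomainName structure can be sketched in a few lines of Python; the address shown is invented for illustration, and the helper function is not part of any standard library.

```python
# Sketch: splitting an e-mail address into its personal identifier and
# domain name, and the domain into its period-separated sections.
def parse_email(address: str):
    personal_id, _, domain = address.partition("@")
    if not personal_id or not domain or "." not in domain:
        raise ValueError(f"not a valid e-mail address: {address!r}")
    return personal_id, domain.split(".")

# "jsmith@mail.example.edu" is a made-up address for illustration.
user, domain_parts = parse_email("jsmith@mail.example.edu")
print(user)          # the personal identifier: jsmith
print(domain_parts)  # the domain's sections; the rightmost is the most general
```

Reading the returned list from right to left mirrors the general-to-specific ordering of domain sections described below.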
From right to left, the portions of a domain name run from the most general to the most specific. In the United States, the rightmost portion of a domain is typically one of the following: com (commercial), edu (educational), gov (government), mil (military), net (network services), or org (nonprofit organization).
In November 2000 seven new top-level domain names were created and made available: .biz, .info, .name, .pro, .aero, .coop, and .museum.
In non-U.S. countries, the rightmost portion of a domain name is an indicator of the geographic origin of the domain. For example, Canadian e-mail addresses end with the abbreviation "ca."
Commercial abuse of e-mail continues to be problematic as companies attempt to e-mail millions of online users in bulk. This technique is called "spam," so named after a skit by the comedy troupe Monty Python that involved the continuous repetition of the word. Online users are deluged with a massive amount of unwanted e-mail selling a wide array of products and services. Spam has become a network-wide problem as it impacts information transfer time and overall network load. Several organizations and governments are attempting to solve the spam problem through legislation or regulation.
Computer viruses spread by e-mail have also grown as the Internet has grown. The widespread use of e-mail and the growing number of new, uninformed computer users have made it very easy to spread malicious viruses across the network. Security issues for both personal computers and network servers will continue to be a crucial aspect of the ongoing development of the Internet and World Wide Web.
The World Wide Web (WWW) is a system and a set of standards for providing a graphical user interface (GUI) to Internet communications. The WWW is the single most important factor in the popularity of the Internet, because it makes the technology easy to use and presents information to users in an attractive, entertaining way.
Graphics, text, audio, animation, and video can be combined on Web pages to create dynamic and highly interactive access to information. In addition, Web pages can be connected to each other via hyperlinks, which are visible to the user as highlighted text, underlined text, or images that the user can click to access another Web page.
Web pages are available to users via Web browsers, such as Mozilla Firefox, Netscape Navigator, or Microsoft's Internet Explorer. Browsers are programs that run on the user's computer and provide the interface that displays the graphics, text, and hyperlinks to the user. Browsers recognize and interpret the programming language called Hypertext Markup Language (HTML). HTML includes the ability to format and display text; size and position graphics images for display; invoke and present animation or video clips; and run small programs, called applets, for more complex interactive operations. Browsers also implement hyperlinks, allowing users to connect to any Web page they want.
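How a browser recognizes hyperlinks in HTML can be sketched with Python's standard html.parser module; the HTML fragment and URL below are invented for illustration.

```python
# Sketch: pulling the target of each <a href="..."> hyperlink out of HTML,
# much as a browser does when it renders clickable links.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":                     # anchor tags carry hyperlinks
            for name, value in attrs:
                if name == "href":         # href holds the target Web page
                    self.links.append(value)

# A small, made-up HTML fragment for illustration.
page = '<p>See <a href="http://www.example.com/next.html">the next page</a>.</p>'
extractor = LinkExtractor()
extractor.feed(page)
print(extractor.links)   # the hyperlink targets found in the page
```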
Sometimes a user knows what information she needs but does not know the precise Web page she wants to view. A subject-oriented search can be accomplished with the aid of search engines, tools that locate Web pages based on search criteria established by the user. Commonly used search engines include Google, Yahoo!, Teoma, and AltaVista.
The ease with which users can publish their own information using the World Wide Web has created an opportunity for everyone to be a publisher. One outcome is that every topic, hobby, niche, and fetish now has a thriving community of like-minded people. Publishing on the Web became even easier with the advent of Web logs, or "blogs," online diaries that opened the floodgates to an even greater level of individual participation in information sharing and community.
A Uniform Resource Locator (URL) is a networked extension of the standard filename concept. It allows the user to point to a file in a directory on any machine on the Internet. In addition to files, URLs can point to queries, documents stored deep within databases, and many other entities. Primarily, however, URLs are used to identify and locate Web pages.
A URL is composed of three parts:
The protocol is the first part of the address. In a Web address, the letters "http" stand for Hypertext Transfer Protocol, signifying how the request should be handled. The protocol information is followed by a colon. URL protocols usually take one of the following types: http, ftp, gopher, telnet, news, or mailto.
The second part is the name of the server or machine to which the query should be directed. For an "http" request, the colon is followed by two forward slashes, indicating that the request should be sent to a machine.
The rest of a URL specifies the particular computer name, any directory tree information, and a file name, with the latter two pieces of information being optional for Web pages. The computer name is the domain name or a variation on it (on the Web, the domain is most commonly preceded by a machine prefix "www" to identify the computer that is functioning as the organization's Web server, as opposed to its e-mail server, etc.).
If a particular file isn't located at the top level of the directory structure (as organized and defined by whoever sets up the Web site), there may be one or more strings of text separated by slashes, representing the directory hierarchy.
Finally, the last string of text to the right of the rightmost slash is the individual file name; on the Web, this often ends with the extension "htm" or "html" to signify it's an HTML document. When no directory path or file name is specified (e.g., the URL http://www.domain.com ), the browser is typically pointed automatically to an unnamed (at least from the user's perspective) default or index page, which often constitutes an organization's home or start page.
Thus, a full URL with a directory path and file name may look something like this: http://www.domain.com/directory/filename.html
Lastly, a Web URL might also contain, somewhere to the right of the domain name, a long string of characters that does not correspond to a traditional directory path or file name, but rather is a set of commands or instructions to a server program or database application. The syntax of these URLs depends on the underlying software program being used. Sometimes these can function as reusable URLs (e.g., they can be bookmarked and retrieved repeatedly), but other times they must be generated by the site's server at the time of use, and thus can't be retrieved directly from a bookmark or by typing them in manually.
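The URL components described above can be pulled apart with Python's standard urllib.parse module; the URL itself is a made-up example.

```python
# Sketch: decomposing a URL into protocol, machine/domain name, directory
# path and file name, and server-program instructions (the query string).
from urllib.parse import urlparse

# A hypothetical URL for illustration.
url = "http://www.domain.com/products/list.html?category=books&page=2"
parts = urlparse(url)

print(parts.scheme)   # the protocol: http
print(parts.netloc)   # the Web server's machine name: www.domain.com
print(parts.path)     # the directory path and file name: /products/list.html
print(parts.query)    # instructions to a server program: category=books&page=2
```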
To gain access to the Internet, a user typically subscribes to an Internet service provider (ISP). ISPs are companies that have a permanent connection to the Internet. Subscribers connect to the ISP's server computer, and through that connection gain access to the Internet. Some well-known commercial ISPs include America Online, MSN, and EarthLink, although there are hundreds of such services.
Alternative access to the Internet is provided by academic institutions (e.g., colleges and universities) and government agencies. Most students and faculty members have accounts on their school's computer system, through which they can gain access to the Internet.
The Internet is a network of computers, or more accurately, a vast network of networks. These networks are connected to each other via a high-speed backbone, a communication link that joins the major Internet host computers. These hosts are primarily mainframe computers at academic institutions. The communication along the Internet follows the Transmission Control Protocol (TCP)/Internet Protocol (IP) communications standard.
TCP is a connection-oriented protocol, which enables two hosts to establish a direct connection and exchange streams of data. TCP guarantees delivery of data and also guarantees that packets will be delivered in the same order in which they were sent. In this regard, TCP acts like a telephone conversation.
IP is a connectionless protocol, acting something like the postal system. It allows you to address a data packet and drop it in the system, but there's no direct link between you and the recipient.
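TCP's connection-oriented, ordered exchange can be sketched with Python's standard socket module over the local loopback interface; the port is chosen by the operating system and the message is arbitrary.

```python
# Sketch: a TCP client and server exchanging a stream of data over a
# direct connection, on this machine's loopback interface.
import socket
import threading

def echo_server(server_sock):
    conn, _ = server_sock.accept()      # wait for a client to connect
    with conn:
        data = conn.recv(1024)          # read the client's bytes
        conn.sendall(data)              # echo them back, in order

# Bind to an ephemeral port on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=echo_server, args=(server,))
t.start()

# The client establishes a direct connection (TCP's handshake), then
# exchanges data with guaranteed, in-order delivery.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello, internet")
    reply = client.recv(1024)

t.join()
server.close()
print(reply.decode())   # the echoed message arrives intact and in order
```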
Each node (or computer) on the Internet is assigned a unique IP address. IP addresses are 32-bit numbers, normally written as four octets in decimal, e.g., 126.96.36.199; together these identify the particular network and host. IP addresses are numeric values, but most users prefer symbolic names to identify the hosts they want to access. Thus, the Internet provides the Domain Name System (DNS), which allows users to locate Internet hosts by symbolic name.
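The relationship between dotted-quad notation and the underlying 32-bit number can be sketched as follows; the helper functions are illustrative, not a standard library API, and the address reuses the example above.

```python
# Sketch: converting between dotted-quad notation and the 32-bit
# integer that an IPv4 address really is.
def ip_to_int(addr: str) -> int:
    octets = [int(o) for o in addr.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {addr!r}")
    value = 0
    for o in octets:
        value = (value << 8) | o     # each octet contributes 8 bits
    return value

def int_to_ip(value: int) -> str:
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

n = ip_to_int("126.96.36.199")
print(n)                 # the single 32-bit number behind the dotted quad
print(int_to_ip(n))      # and back to dotted-quad notation
```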
Internet technology has become extremely beneficial for businesses and other organizations as a cost-effective means of implementing their corporate-wide telecommunications needs. However, the public nature of the Internet poses a challenge to any company wishing to take advantage of its potential. The TCP/IP protocol does not provide adequate security for commercial institutions. It is relatively easy to eavesdrop on transmissions, and there is no inherent authentication mechanism. Thus, many companies adopt intranets, private networks based on Internet technology.
Intranets use the company's existing network infrastructure, together with the TCP/IP protocol, Web browsers, Web server technologies, and HTML-formatted Web pages. The key distinction between an intranet and the Internet is the use of a firewall, which is a security system with specialized software and/or hardware that can prevent unauthorized users from gaining access to the company's intranet server.
Intranets have many advantages over other corporate-wide network implementations. They are comparatively inexpensive to implement and easily allow different types of computers to communicate with each other, which overcomes a major obstacle to corporate-wide information sharing.
Most companies have a wide variety of computer platforms, including PCs, mainframes, and minicomputers, spread throughout the organization. Web technology and TCP/IP communications standards enable these diverse platforms to maintain a consistent user interface, thus reducing the amount of time it takes users to become proficient on the network.
Intranets are used for many business purposes, ranging from distribution of corporate documents to facilitating group collaboration via groupware and teleconferencing, to full-blown transaction-processing applications.
The File Transfer Protocol (FTP) is a method of moving files between two Internet sites. Files can contain software, text, graphics, or other file formats.
Early Internet users developed FTP so researchers could copy files from one place to another across the Internet. Until 1995 and the popularization of the World Wide Web, FTP accounted for more traffic on the Internet than any other service. Nowadays, the bulk of traffic travels via the Web, although files can still be downloaded over FTP through an Internet browser; in this case, the URL begins with the protocol "ftp://" instead of "http://".
Although using FTP to transfer files from one system to another usually requires a user ID on both systems, many host systems provide anonymous FTP services. Anonymous FTP lets anyone in the world have access to a certain area of disk space on the host system and allows some files to be publicly available. Some systems have dedicated entire disks or even entire computers to maintaining extensive archives of source code and information. These sites are called anonymous FTP servers.
Once a user logs onto an FTP server, he or she can transfer data to or from that server using common FTP commands. The basic syntax for FTP commands is based on the UNIX operating system; however, many software products are available that provide graphic interfaces to FTP and thus simplify the file transfer process.
As FTP sites proliferated over the Internet, it became necessary to create directories and indexes so Internet users could quickly locate desired information. Three tools were commonly used for this purpose. Archie is a database server that provides keyword searches to locate relevant FTP files. Gopher, originally developed at the University of Minnesota, provides menu-oriented directories of FTP files and sites; the menus are arranged in a hierarchical structure based on topics, and are hyperlinked to FTP sites and even to other Gopher sites. Finally, Veronica (Very Easy Rodent-Oriented Netwide Index) is a keyword-search tool that searches Gopher sites for relevant subject material.
Although these three tools are useful, their use has declined with the advent of the World Wide Web. One can think of the Archies, Gophers, and Veronicas of the world as being precursors to the modern search engines of the Web. In fact, Gophers are themselves accessible from the Web, and have their own URL protocol.
Telnet is the Internet standard protocol for remote terminal connection, allowing a user at one site to interact with a remote timesharing system at another site as if the user's terminal were connected directly to the remote computer. A Telnet program is the terminal emulation software you use to log in to an Internet host; the host has similar Telnet software. Thus, via Telnet, your computer becomes a terminal connected to the host computer, and your interaction with that computer is the same as it would be if you were sitting at a terminal wired directly to that computer.
Telnet is a text-based connection protocol, providing only character-based communications capabilities with the host. Thus, Telnet has been greatly overshadowed by the Web, as there is limited content available by Telnet and it requires knowledge of various system commands. However, there are still many Telnet sites available. Most Internet browsers allow access to Telnet sites by specifying the Telnet protocol as the first part of a URL.
The peer-to-peer (P2P) protocol began to gain popularity in the late 1990s and early 2000s. P2P allows the sharing of individual computer hard drives and storage devices, and it spreads network usage and downloading across all of the linked computers, distributing the load more evenly. It also reduces accountability in serving and acquiring data. P2P has been extremely popular for downloading music, videos, and books.
The first and most well-known instance of P2P was Napster, launched in 1999, a file-sharing application for exchanging music files between users without regard to copyright restrictions or royalties. The Recording Industry Association of America pursued extensive litigation to stop Napster from facilitating copyright infringement. Napster eventually acquiesced to the legal actions; however, P2P downloading continues to be a corporate issue.
Usenet is an Internet news/discussion group forum that allows ongoing conversations on a given topic to occur over an extended period of time (weeks, months, and even years). These newsgroups are organized in a bulletin board framework, so that any Internet user can read or post messages to any topic area. Although Usenet newsgroups existed long before the WWW, they are still in wide use and accessible from the Web via their own URL protocol.
The idea for the Internet began in the early 1960s as a military network developed by the U.S. Department of Defense's Advanced Research Projects Agency (ARPA). At first, it was a small network called ARPANET, which promoted the sharing of supercomputers among military researchers in the United States. A few years later, ARPA began to sponsor research into a cooperative network of academic time-sharing computers. By 1969, the first ARPANET hosts were established at the Stanford Research Institute, the University of California Los Angeles (UCLA), the University of California Santa Barbara, and the University of Utah.
In the early 1970s, use of ARPANET expanded dramatically. Although it was originally designed to allow scientists to share data and access remote computers, e-mail quickly became ARPANET's most popular application, as researchers across the country used it to collaborate on research projects and discuss topics of interest.
In 1972, the InterNetworking Working Group (INWG) was established as the first standards-setting organization to govern the growing network. Under the leadership of Vinton Cerf, known as the "father of the Internet," INWG began to address the need for agreed-upon protocols and standardization of ARPANET functionality. Two early protocols, Telnet and FTP, are still in use today.
By 1973, ARPANET had crossed national boundaries, establishing connections to University College London in England and to the NORSAR installation in Norway. In 1974, a commercial version of ARPANET, called Telenet, was developed by Bolt, Beranek, and Newman, Inc. (BBN), one of the original ARPA contractors that had helped get ARPANET running. Telenet began a move away from the military and research roots of the original ARPANET.
In 1979, faculty members and graduate students at Duke University and the University of North Carolina created the first Usenet newsgroups, enabling users from all over the world to join discussion groups on a myriad of subjects, including politics, religion, computing, and even less-than-savory topics. Usenet fueled a continuing wave of growth.
Between 1981 and 1988, ARPANET grew from around 200 hosts to more than 60,000. Many factors influenced this explosive growth. First was the boom in the personal computer industry. With more people using inexpensive desktop machines, and with the advent of powerful, network-ready servers, many companies began to join this vast computer network for the first time, using it to communicate with each other and with their customers.
A second factor in growth was the National Science Foundation's NSFNET, built in 1986 to connect the NSF's new supercomputing centers and the university research community. NSFNET combined with ARPANET to form a huge backbone of network hosts. This backbone became what we now think of as the Internet (although the term "Internet" was in use as early as 1982).
The third factor in growth was the concept of internetworking, which began to appear in popular culture in the 1980s. William Gibson's 1984 novel Neuromancer coined the ubiquitous term "cyberspace" to describe the new virtual communities, cultures, and geographies that the Internet provides.
The explosive growth of the Internet came with major problems, particularly related to privacy and security in the digital world. Computer crime and malicious destruction became a paramount concern. One dramatic incident occurred in 1988 when a program called the "Morris worm" temporarily disabled approximately 10 percent of all Internet hosts across the country. The Computer Emergency Response Team (CERT) was formed in 1988 to address such security concerns.
In 1990, as the number of hosts approached 300,000, ARPANET was decommissioned, leaving the Internet with NSFNET as its sole backbone. The 1990s saw the commercialization of the Internet, made possible when the NSF lifted its restriction on commercial use and cleared the way for the age of electronic commerce.
Electronic commerce was further enhanced by new applications being introduced to the Internet. For example, programmers at the University of Minnesota developed the first point-and-click method of navigating Internet files in 1991. This program, which was freely distributed on the Internet, was called Gopher, and it was soon complemented by search tools such as Archie and Veronica.
An even more influential development, also begun in the early 1990s, was Tim Berners-Lee's work on the World Wide Web, in which hypertext-formatted pages of words, pictures, and sounds promised to become an advertiser's dream come true. At the same time, Marc Andreessen and colleagues at the National Center for Supercomputing Applications (NCSA), located on the campus of the University of Illinois at Urbana-Champaign, were developing a graphical browser for the World Wide Web called Mosaic (released in 1993), whose developers would go on to create Netscape.
By 1995, the Internet had become so commercialized that most access to the Internet was handled through Internet service providers (ISPs), such as America Online and Netcom. At that time, NSF relinquished control of the Internet, which was now dominated by WWW traffic.
Partly motivated by the increased commercial interest in the Internet, Sun Microsystems released an Internet programming language called Java, which promised to radically alter the way applications and information can be retrieved, displayed, and used over the Internet.
By 1996, the Internet's twenty-fifth anniversary, there were 40 million Internet users, and Internet-based electronic commerce had reached major proportions, with more than $1 billion in Internet shopping mall transactions.
The Internet is now truly global, with 150 countries connected. In less than 30 years, the Internet migrated from an American military information management tool to an information superhighway serving the entire world.
The Internet revolutionized late twentieth and early twenty-first century society as dramatically as the railroads and the Industrial Revolution of the nineteenth century. Telecommuting, e-commerce, blogs, and virtual communities have broken geographic boundaries and brought people closer together.
At the same time, the Internet has introduced significant social challenges. There is a danger of creating a second-class citizenship among those without access. Privacy and security are continuing concerns. The changing workplace is drastically altering society as the information age makes industrial-era skills obsolete. The twenty-first century will be strongly influenced by the dispersion of information technology, and the Internet promises to be the conduit of this technology.
Revised by Hal P. Kirkwood, Jr.