Daily Archives: September 15, 2011

World Wide Web- WWW- W3 Detailed Study

World Wide Web

The World Wide Web (abbreviated as WWW or W3 and commonly known as the Web) is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia and navigate between them via hyperlinks.

Using concepts from earlier hypertext systems, British engineer and computer scientist Sir Tim Berners-Lee, now Director of the World Wide Web Consortium (W3C), wrote a proposal in March 1989 for what would eventually become the World Wide Web. At CERN in Geneva, Switzerland, Berners-Lee and Belgian computer scientist Robert Cailliau proposed in 1990 to use hypertext “… to link and access information of various kinds as a web of nodes in which the user can browse at will”, and they publicly introduced the project in December.

“The World-Wide Web was developed to be a pool of human knowledge, and human culture, which would allow collaborators in remote sites to share their ideas and all aspects of a common project.”

 

History

In the May 1970 issue of Popular Science magazine Arthur C. Clarke was reported to have predicted that satellites would one day “bring the accumulated knowledge of the world to your fingertips” using a console that would combine the functionality of the Xerox, telephone, television and a small computer, allowing data transfer and video conferencing around the globe.

In March 1989, Tim Berners-Lee wrote a proposal that referenced ENQUIRE, a database and software project he had built in 1980, and described a more elaborate information management system.

With help from Robert Cailliau, he published a more formal proposal (on November 12, 1990) to build a “Hypertext project” called “WorldWideWeb” (one word, also “W3”) as a “web” of “hypertext documents” to be viewed by “browsers” using a client–server architecture. This proposal estimated that a read-only web would be developed within three months and that it would take six months to achieve “the creation of new links and new material by readers, [so that] authorship becomes universal” as well as “the automatic notification of a reader when new material of interest to him/her has become available.” While the read-only goal was met, accessible authorship of web content took longer to mature, with the wiki concept, blogs, Web 2.0 and RSS/Atom.

The proposal was modeled after the Dynatext SGML reader by Electronic Book Technology, a spin-off from the Institute for Research in Information and Scholarship at Brown University. The Dynatext system, licensed by CERN, was technically advanced and was a key player in the extension of SGML ISO 8879:1986 to Hypermedia within HyTime, but it was considered too expensive and had an inappropriate licensing policy for use in the general high energy physics community, namely a fee for each document and each document alteration.

[Figure: The NeXT Computer used by Tim Berners-Lee at CERN, which became the first web server]

[Figure: The CERN datacenter in 2010, housing some WWW servers]

A NeXT Computer was used by Berners-Lee as the world’s first web server and also to write the first web browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web: the first web browser (which was a web editor as well); the first web server; and the first web pages, which described the project itself. On August 6, 1991, he posted a short summary of the World Wide Web project on the alt.hypertext newsgroup. This date also marked the debut of the Web as a publicly available service on the Internet. The first photo on the web was uploaded by Berners-Lee in 1992, an image of the CERN house band Les Horribles Cernettes.

The Web as a “side effect” of 40 years of particle physics experiments. It has happened many times in the history of science that the most impressive results of large-scale scientific efforts appeared far away from the main directions of those efforts… After World War II, the nuclear centers of almost all developed countries became the places with the highest concentration of talented scientists. For about four decades many of them were invited to CERN’s international laboratories. So a specific kind of CERN intellectual “entire culture” (as you called it) was constantly growing from one generation of scientists and engineers to another. When the concentration of human talent per square foot of CERN’s labs reached critical mass, it caused an intellectual explosion. The Web, a crucial point in human history, was born… Nothing could be compared to it… We can’t yet imagine the real scale of the recent shake-up, because there have never been such fast-growing, multi-dimensional socio-economic processes in human history…

The first server outside Europe was set up at SLAC to host the SPIRES-HEP database. Accounts differ substantially as to the date of this event. The World Wide Web Consortium says December 1992, whereas SLAC itself claims 1991. The 1991 date is supported by a W3C document entitled A Little History of the World Wide Web.

The crucial underlying concept of hypertext originated with older projects from the 1960s, such as the Hypertext Editing System (HES) at Brown University, Ted Nelson’s Project Xanadu, and Douglas Engelbart’s oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar Bush’s microfilm-based “memex”, which was described in the 1945 essay “As We May Think”.

Berners-Lee’s breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he explains that he had repeatedly suggested that a marriage between the two technologies was possible to members of both technical communities, but when no one took up his invitation, he finally tackled the project himself. In the process, he developed three essential technologies:

  1. a system of globally unique identifiers for resources on the Web and elsewhere, the Universal Document Identifier (UDI), later known as Uniform Resource Locator (URL) and Uniform Resource Identifier (URI);
  2. the publishing language HyperText Markup Language (HTML);
  3. the Hypertext Transfer Protocol (HTTP).

The World Wide Web had a number of differences from other hypertext systems that were then available. The Web required only unidirectional links rather than bidirectional ones. This made it possible for someone to link to another resource without action by the owner of that resource. It also significantly reduced the difficulty of implementing web servers and browsers (in comparison to earlier systems), but in turn presented the chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide Web was non-proprietary, making it possible to develop servers and clients independently and to add extensions without licensing restrictions. On April 30, 1993, CERN announced that the World Wide Web would be free to anyone, with no fees due. Coming two months after the announcement that the server implementation of the Gopher protocol was no longer free to use, this produced a rapid shift away from Gopher and towards the Web. An early popular web browser was ViolaWWW for Unix and the X Windowing System.

Scholars generally agree that a turning point for the World Wide Web began with the introduction of the Mosaic web browser in 1993. Mosaic was a graphical browser developed by a team at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign (NCSA-UIUC), led by Marc Andreessen. Funding for Mosaic came from the U.S. High-Performance Computing and Communications Initiative and the High Performance Computing and Communication Act of 1991, one of several computing developments initiated by U.S. Senator Al Gore. Prior to the release of Mosaic, graphics were not commonly mixed with text in web pages, and the Web was less popular than older protocols in use over the Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic’s graphical user interface allowed the Web to become, by far, the most popular Internet protocol.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the European Organization for Nuclear Research (CERN) in October, 1994. It was founded at the Massachusetts Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense Advanced Research Projects Agency (DARPA), which had pioneered the Internet; a year later, a second site was founded at INRIA (a French national computer research lab) with support from the European Commission DG InfSo; and in 1996, a third continental site was created in Japan at Keio University. By the end of 1994, while the total number of websites was still minute compared to present standards, quite a number of notable websites were already active, many of which are the precursors or inspiration for today’s most popular services.

Connected by the existing Internet, other websites were created around the world, adding international standards for domain names and HTML. Since then, Berners-Lee has played an active role in guiding the development of web standards (such as the markup languages in which web pages are composed), and in recent years has advocated his vision of a Semantic Web. The World Wide Web enabled the spread of information over the Internet through an easy-to-use and flexible format. It thus played an important role in popularizing use of the Internet. Although the two terms are sometimes conflated in popular use, World Wide Web is not synonymous with Internet. The Web is a collection of documents and both client and server software using Internet protocols such as TCP/IP and HTTP.

 

Function

The terms Internet and World Wide Web are often used in every-day speech without much distinction. However, the Internet and the World Wide Web are not one and the same. The Internet is a global system of interconnected computer networks. In contrast, the Web is one of the services that runs on the Internet. It is a collection of textual documents and other resources, linked by hyperlinks and URLs, transmitted by web browsers and web servers. In short, the Web can be thought of as an application “running” on the Internet.

Viewing a web page on the World Wide Web normally begins either by typing the URL of the page into a web browser, or by following a hyperlink to that page or resource. The web browser then initiates a series of communication messages, behind the scenes, in order to fetch and display it. As an example, consider the Wikipedia page for this article with the URL

http://en.wikipedia.org/wiki/World_Wide_Web .

First, the browser resolves the server-name portion of the URL  into an Internet Protocol address using the globally distributed database known as the Domain Name System (DNS); this lookup returns an IP address such as 208.80.152.2. The browser then requests the resource by sending an HTTP request across the Internet to the computer at that particular address. It makes the request to a particular application port in the underlying Internet Protocol Suite so that the computer receiving the request can distinguish an HTTP request from other network protocols it may be servicing such as e-mail delivery; the HTTP protocol normally uses port 80. The content of the HTTP request can be as simple as the two lines of text

GET /wiki/World_Wide_Web HTTP/1.1

Host: en.wikipedia.org
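The two request lines above can be derived mechanically from the URL. The following is a minimal sketch in Python, not a full HTTP client; the helper name build_get_request is hypothetical and only covers the simple case described here.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the root of the site
    if parts.query:
        path += "?" + parts.query
    # HTTP separates header lines with CRLF and ends the header block
    # with an empty line.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


request = build_get_request("http://en.wikipedia.org/wiki/World_Wide_Web")
print(request)
```

The server-name portion of the URL ends up in the Host header, and everything after it becomes the request target on the first line.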

The computer receiving the HTTP request delivers it to Web server software listening for requests on port 80. If the web server can fulfill the request it sends an HTTP response back to the browser indicating success, which can be as simple as

HTTP/1.0 200 OK

Content-Type: text/html; charset=UTF-8

followed by the content of the requested page. The Hypertext Markup Language for a basic web page looks like

<html>

<head>

<title>World Wide Web — Wikipedia, the free encyclopedia</title>

</head>

<body>

<p>The <b>World Wide Web</b>, abbreviated as <b>WWW</b> and commonly known …</p>

</body>

</html>

The web browser parses the HTML, interpreting the markup (<title>, <b> for bold, and such) that surrounds the words in order to draw that text on the screen.
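As a rough illustration of this parsing step, the sketch below uses the HTMLParser class from Python's standard library to walk a shortened version of the page and pick out the title and the bold text. The class name TitleAndBoldCollector and the sample page string are made up for this example; a real browser does far more work than this.

```python
from html.parser import HTMLParser


class TitleAndBoldCollector(HTMLParser):
    """Record the page title and any text that appears inside <b> tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.title = ""
        self.bold = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        current = self.open_tags[-1] if self.open_tags else None
        if current == "title":
            self.title += data
        elif current == "b":
            self.bold.append(data)


page = (
    "<html><head><title>World Wide Web</title></head>"
    "<body><p>The <b>World Wide Web</b>, abbreviated as <b>WWW</b></p></body></html>"
)
collector = TitleAndBoldCollector()
collector.feed(page)
collector.close()
print(collector.title)
print(collector.bold)
```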

Many web pages consist of more elaborate HTML which references the URLs of other resources such as images, other embedded media, scripts that affect page behavior, and Cascading Style Sheets that affect page layout. A browser that handles complex HTML will make additional HTTP requests to the web server for these other Internet media types. As it receives their content from the web server, the browser progressively renders the page onto the screen as specified by its HTML and these additional resources.
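The additional requests can be pictured as follows. This is a simplified sketch that assumes only three kinds of references (img and script tags with a src attribute, link tags with an href attribute); the class name ResourceCollector, the sample markup and the example.com and example.org addresses are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of images, scripts and style sheets in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            reference = attributes["src"]
        elif tag == "link" and attributes.get("href"):
            reference = attributes["href"]
        else:
            return
        # Relative references are resolved against the URL of the page itself.
        self.resources.append(urljoin(self.base_url, reference))


page = """
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="http://cdn.example.org/app.js"></script>
</head><body>
<img src="images/logo.png" alt="logo">
</body></html>
"""
finder = ResourceCollector("http://www.example.com/articles/page.html")
finder.feed(page)
for url in finder.resources:
    print(url)
```

Each URL in the resulting list would be fetched with its own HTTP request before the page is fully rendered.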

 

Linking

Most web pages contain hyperlinks to other related pages and perhaps to downloadable files, source documents, definitions and other web resources (this Wikipedia article is full of hyperlinks). In the underlying HTML, a hyperlink is written as an anchor element whose href attribute holds the target URL, in the general form <a href="http://www.example.com/">link text</a>.

[Figure: Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks]

Such a collection of useful, related resources, interconnected via hypertext links, is dubbed a web of information. Publication on the Internet created what Tim Berners-Lee first called the WorldWideWeb (in its original CamelCase, which was subsequently discarded) in November 1990.

The hyperlink structure of the WWW is described by the webgraph: the nodes of the webgraph correspond to the web pages (or URLs), and the directed edges between them correspond to the hyperlinks.
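A tiny webgraph can be written down as a mapping from each page to the pages it links to. The page names below are invented for illustration, and the in_degree helper is a hypothetical name; the sketch simply counts how many hyperlinks point at each page.

```python
# Each key is a page; its list holds the pages it links to (directed edges).
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html"],
}


def in_degree(graph):
    """Count the incoming hyperlinks of every page in the graph."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


incoming = in_degree(links)
print(incoming)
```

Because the edges are directed, d.html can link to c.html without c.html linking back or even knowing about it.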

Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced with different content. This makes hyperlinks obsolete, a phenomenon referred to in some circles as link rot and the hyperlinks affected by it are often called dead links. The ephemeral nature of the Web has prompted many efforts to archive web sites. The Internet Archive, active since 1996, is one of the best-known efforts.

 

Dynamic updates of web pages

JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then of Netscape, for use within web pages. The standardized version is ECMAScript. To overcome some of the limitations of the page-by-page model described above, some web applications also use Ajax (asynchronous JavaScript and XML). JavaScript delivered with the page can make additional HTTP requests to the server, either in response to user actions such as mouse clicks or based on elapsed time. The server’s responses are used to modify the current page rather than creating a new page with each response. Thus the server only needs to provide limited, incremental information. Since multiple Ajax requests can be handled at the same time, users can interact with a page even while data is being retrieved. Some web applications regularly poll the server to ask if new information is available.

 

WWW prefix

Many domain names used for the World Wide Web begin with www because of the long-standing practice of naming Internet hosts (servers) according to the services they provide. The hostname for a web server is often www, in the same way that it may be ftp for an FTP server, and news or nntp for a USENET news server. These host names appear as Domain Name System (DNS) subdomain names, as in http://www.example.com. The use of ‘www’ as a subdomain name is not required by any technical or policy standard; indeed, the first ever web server was called nxoc01.cern.ch, and many web sites exist without it. Many established websites still use ‘www’, or they invent other subdomain names such as ‘www2’, ‘secure’, etc. Many such web servers are set up such that both the domain root (e.g., example.com) and the www subdomain (e.g., http://www.example.com) refer to the same site; others require one form or the other, or they may map to different web sites.

The use of a subdomain name is useful for load balancing incoming web traffic by creating a CNAME record that points to a cluster of web servers. Because, currently, only a subdomain can be aliased with a CNAME record, the same result cannot be achieved by using the bare domain root.

When a user submits an incomplete website address to a web browser in its address bar input field, some web browsers automatically try adding the prefix “www” to the beginning of it and possibly “.com”, “.org” and “.net” at the end, depending on what might be missing. For example, entering ‘microsoft’ may be transformed to http://www.microsoft.com/ and ‘openoffice’ to http://www.openoffice.org. This feature started appearing in early versions of Mozilla Firefox in early 2003, when it still had the working title ‘Firebird’, and derives from a much older practice in browsers such as Lynx. It is reported that Microsoft was granted a US patent for the same idea in 2008, but only for mobile devices.
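The guessing described above might look roughly like the sketch below. It is an assumption about the general idea, not the behavior of any particular browser: the function name candidate_urls, the order of the suffixes and the rule "a dot means the name is already complete" are all invented for this example.

```python
def candidate_urls(typed):
    """Return addresses a browser might try for an incomplete entry."""
    typed = typed.strip()
    if "." in typed:
        # The entry already looks like a host name, so only add the scheme.
        hosts = [typed]
    else:
        # A bare word: try the www prefix with a few common suffixes.
        hosts = ["www." + typed + suffix for suffix in (".com", ".org", ".net")]
    return ["http://" + host + "/" for host in hosts]


print(candidate_urls("microsoft"))
print(candidate_urls("example.com"))
```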

In English, www is pronounced by individually pronouncing the name of characters (double-u double-u double-u). Although some technical users pronounce it dub-dub-dub, this is not widespread. The English writer Douglas Adams once quipped in The Independent on Sunday (1999): “The World Wide Web is the only thing I know of whose shortened form takes three times longer to say than what it’s short for,” with Stephen Fry later pronouncing it in his “Podgrammes” series of podcasts as “wuh wuh wuh.” In Mandarin Chinese, World Wide Web is commonly translated via a phono-semantic matching to wàn wéi wǎng (万维网), which satisfies www and literally means “myriad dimensional net”, a translation that very appropriately reflects the design concept and proliferation of the World Wide Web. Tim Berners-Lee’s web-space states that World Wide Web is officially spelled as three separate words, each capitalized, with no intervening hyphens.

Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain names and make them easily pronounceable. As the mobile web grows in popularity, services like Gmail.com, MySpace.com, Facebook.com and Twitter.com are most often discussed without adding the www to the domain.

 

http and https specifiers

The scheme specifiers (http:// or https://) in URIs refer to the Hypertext Transfer Protocol and to HTTP Secure respectively and so define the communication protocol to be used for the request and response. The HTTP protocol is fundamental to the operation of the World Wide Web; the added encryption layer in HTTPS is essential when confidential information such as passwords or banking information is to be exchanged over the public Internet. If the scheme is omitted, web browsers usually prepend it to the URL.
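The sketch below shows how the scheme determines the rest of the connection details. It assumes that a missing scheme defaults to plain http, and it uses port 443 as the usual default for HTTPS, a detail not stated in the text above; the function name normalize is hypothetical.

```python
from urllib.parse import urlsplit

# Usual default ports; 443 for https is an addition to the text above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(address):
    """Return (scheme, host, port) for an address typed with or without a scheme."""
    if "://" not in address:
        address = "http://" + address  # assumption: fall back to plain http
    parts = urlsplit(address)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


print(normalize("www.example.com"))
print(normalize("https://www.example.com/login"))
print(normalize("http://www.example.com:8080/"))
```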

 

Privacy

Computer users, who save time and money, and who gain conveniences and entertainment, may or may not have surrendered the right to privacy in exchange for using a number of technologies including the Web. For example: more than a half billion people worldwide have used a social network service, and of Americans who grew up with the Web, half created an online profile and are part of a generational shift that could be changing norms. The social network Facebook progressed from U.S. college students to a 70% non-U.S. audience, but in 2009 estimated that only 20% of its members use privacy settings. In 2010 (six years after co-founding the company), Mark Zuckerberg wrote, “we will add privacy controls that are much simpler to use”.

Privacy representatives from 60 countries have resolved to ask for laws to complement industry self-regulation, for education for children and other minors who use the Web, and for default protections for users of social networks. They also believe data protection for personally identifiable information benefits business more than the sale of that information. Users can opt-in to features in browsers to clear their personal histories locally and block some cookies and advertising networks but they are still tracked in websites’ server logs, and particularly web beacons. Berners-Lee and colleagues see hope in accountability and appropriate use achieved by extending the Web’s architecture to policy awareness, perhaps with audit logging, reasoners and appliances.

In exchange for providing free content, vendors hire advertisers who spy on Web users and base their business model on tracking them. Since 2009, they buy and sell consumer data on exchanges (lacking a few details that could make it possible to de-anonymize, or identify an individual). Hundreds of millions of times per day, Lotame Solutions captures what users are typing in real time, and sends that text to OpenAmplify who then tries to determine, to quote a writer at The Wall Street Journal, “what topics are being discussed, how the author feels about those topics, and what the person is going to do about them”.

Microsoft backed away in 2008 from its plans for strong privacy features in Internet Explorer, leaving its users (50% of the world’s Web users) open to advertisers who may make assumptions about them based on only one click when they visit a website. Among services paid for by advertising, Yahoo! could collect the most data about users of commercial websites, about 2,500 bits of information per month about each typical user of its site and its affiliated advertising network sites. Yahoo! was followed by MySpace with about half that potential and then by AOLTimeWarner, Google, Facebook, Microsoft, and eBay.

 

Security

The Web has become criminals’ preferred pathway for spreading malware. Cybercrime carried out on the Web can include identity theft, fraud, espionage and intelligence gathering. Web-based vulnerabilities now outnumber traditional computer security concerns, and as measured by Google, about one in ten web pages may contain malicious code. Most Web-based attacks take place on legitimate websites, and most, as measured by Sophos, are hosted in the United States, China and Russia. The most common of all malware threats is SQL injection attacks against websites. Through HTML and URIs the Web was vulnerable to attacks like cross-site scripting (XSS) that came with the introduction of JavaScript and were exacerbated to some degree by Web 2.0 and Ajax web design that favors the use of scripts. Today by one estimate, 70% of all websites are open to XSS attacks on their users.
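One common defense against cross-site scripting is to escape user-supplied text before placing it in a page, so that the browser shows it as text instead of running it as markup. The sketch below uses the escape function from Python's standard html module; the render_comment helper and the sample input are hypothetical, and escaping alone is not presented here as a complete protection.

```python
from html import escape


def render_comment(user_text):
    """Wrap user-supplied text in a paragraph, escaping HTML special characters."""
    return "<p>" + escape(user_text) + "</p>"


hostile = '<script>alert("x")</script>'
safe_markup = render_comment(hostile)
print(safe_markup)
```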

Proposed solutions vary to extremes. Large security vendors like McAfee already design governance and compliance suites to meet post-9/11 regulations, and some, like Finjan have recommended active real-time inspection of code and all content regardless of its source. Some have argued that for enterprise to see security as a business opportunity rather than a cost center, “ubiquitous, always-on digital rights management” enforced in the infrastructure by a handful of organizations must replace the hundreds of companies that today secure data and networks. Jonathan Zittrain has said users sharing responsibility for computing safety is far preferable to locking down the Internet.

 

Standards

Many formal standards and other technical specifications and software define the operation of different aspects of the World Wide Web, the Internet, and computer information exchange. Many of the documents are the work of the World Wide Web Consortium (W3C), headed by Berners-Lee, but some are produced by the Internet Engineering Task Force (IETF) and other organizations.

Usually, when web standards are discussed, a core set of publications is seen as foundational. Additional publications provide definitions of other essential technologies for the World Wide Web, including, but not limited to, the following:

  • Uniform Resource Identifier (URI), which is a universal system for referencing resources on the Internet, such as hypertext documents and images. URIs, often called URLs, are defined by the IETF’s RFC 3986 / STD 66: Uniform Resource Identifier (URI): Generic Syntax, as well as its predecessors and numerous URI scheme-defining RFCs;
  • HyperText Transfer Protocol (HTTP), especially as defined by RFC 2616: HTTP/1.1 and RFC 2617: HTTP Authentication, which specify how the browser and server authenticate each other.

 

Accessibility

Access to the Web is for everyone regardless of disability—including visual, auditory, physical, speech, cognitive, and neurological. Accessibility features also help others with temporary disabilities like a broken arm or the aging population as their abilities change. The Web is used for receiving information as well as providing information and interacting with society, making it essential that the Web be accessible in order to provide equal access and equal opportunity to people with disabilities. Tim Berners-Lee once noted, “The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect.” Many countries regulate web accessibility as a requirement for websites. International cooperation in the W3C Web Accessibility Initiative led to simple guidelines that web content authors as well as software developers can use to make the Web accessible to persons who may or may not be using assistive technology.

 

Internationalization

The W3C Internationalization Activity assures that web technology will work in all languages, scripts, and cultures. Beginning in 2004 or 2005, Unicode gained ground and eventually in December 2007 surpassed both ASCII and Western European as the Web’s most frequently used character encoding. Originally RFC 3986 allowed resources to be identified by URI in a subset of US-ASCII. RFC 3987 allows more characters—any character in the Universal Character Set—and now a resource can be identified by IRI in any language.
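When an identifier containing non-ASCII characters has to be carried in ASCII-only form, the characters of the path are typically encoded as UTF-8 bytes and written as percent-escapes. The sketch below illustrates this with Python's urllib.parse; the path is a made-up example, and host names follow different rules that are not covered here.

```python
from urllib.parse import quote, unquote

iri_path = "/wiki/万维网"  # hypothetical path using non-ASCII characters

# quote() encodes the text as UTF-8 and percent-escapes each non-ASCII byte;
# the slash is left untouched by default.
uri_path = quote(iri_path)
print(uri_path)

# The original text can be recovered by reversing the escaping.
print(unquote(uri_path))
```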

 

Statistics

Between 2005 and 2010, the number of Web users doubled, and was expected to surpass two billion in 2010. According to a 2001 study, there were over 550 billion documents on the Web, mostly in the invisible Web, or Deep Web. A 2002 survey of 2,024 million Web pages determined that by far the most Web content was in English: 56.4%; next were pages in German (7.7%), French (5.6%), and Japanese (4.9%). A more recent study, which used Web searches in 75 different languages to sample the Web, determined that there were over 11.5 billion Web pages in the publicly indexable Web as of the end of January 2005. As of March 2009, the indexable web contained at least 25.21 billion pages. On July 25, 2008, Google software engineers Jesse Alpert and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs. As of May 2009, over 109.5 million websites operated. Of these, 74% were commercial or other sites operating in the .com generic top-level domain.

Statistics measuring a website’s popularity are usually based either on the number of page views or associated server ‘hits’ (file requests) that it receives.

 

Speed issues

Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow browsing has led to a pejorative name for the World Wide Web: the World Wide Wait. How to speed up the Internet is the subject of ongoing discussion, centered on the use of peering and QoS technologies. Other solutions to reduce the congestion can be found at W3C. Guidelines for Web response times are:

  • 0.1 second (one tenth of a second). Ideal response time. The user doesn’t sense any interruption.
  • 1 second. Highest acceptable response time. Download times above 1 second interrupt the user experience.
  • 10 seconds. Unacceptable response time. The user experience is interrupted and the user is likely to leave the site or system.
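The three thresholds above can be turned into a simple classifier, for example to label measured page load times. The function name and the wording of the labels are assumptions made for this sketch, as is the treatment of times between 1 and 10 seconds as a separate "interrupted" band.

```python
def classify_response_time(seconds):
    """Label a response time using the 0.1 s, 1 s and 10 s guidelines."""
    if seconds <= 0.1:
        return "ideal"
    if seconds <= 1.0:
        return "acceptable"
    if seconds < 10.0:
        return "interrupted"  # above 1 second the user experience is interrupted
    return "unacceptable"


for sample in (0.05, 0.8, 3.0, 12.0):
    print(sample, classify_response_time(sample))
```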

 

Caching

If a user revisits a Web page after only a short interval, the page data may not need to be re-obtained from the source Web server. Almost all web browsers cache recently obtained data, usually on the local hard drive. HTTP requests sent by a browser will usually only ask for data that has changed since the last download. If the locally cached data are still current, they will be reused. Caching helps reduce the amount of Web traffic on the Internet. The decision about expiration is made independently for each downloaded file, whether image, stylesheet, JavaScript, HTML, or whatever other content the site may provide. Thus even on sites with highly dynamic content, many of the basic resources only need to be refreshed occasionally. Web site designers find it worthwhile to collate resources such as CSS data and JavaScript into a few site-wide files so that they can be cached efficiently. This helps reduce page download times and lowers demands on the Web server.
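One way a browser can ask only for changed data is a conditional request: it sends the time of its cached copy in an If-Modified-Since header, and the server may answer with status 304 (Not Modified) instead of sending the content again. The sketch below shows both halves under simplifying assumptions; the helper names conditional_headers and body_to_use are hypothetical, and no network traffic is involved.

```python
from email.utils import formatdate


def conditional_headers(cached_at):
    """Build the extra request header for revalidating a cached copy.

    cached_at is the time the copy was stored, in seconds since the epoch.
    """
    return {"If-Modified-Since": formatdate(cached_at, usegmt=True)}


def body_to_use(status, cached_body, response_body):
    """Pick the content to display after a conditional request."""
    # 304 means the cached copy is still current and can be reused.
    return cached_body if status == 304 else response_body


print(conditional_headers(0))
print(body_to_use(304, "cached page", ""))
print(body_to_use(200, "cached page", "fresh page"))
```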

There are other components of the Internet that can cache Web content. Corporate and academic firewalls often cache Web resources requested by one user for the benefit of all. (See also Caching proxy server.) Some search engines also store cached content from websites. Apart from the facilities built into Web servers that can determine when files have been updated and so need to be re-sent, designers of dynamically generated Web pages can control the HTTP headers sent back to requesting users, so that transient or sensitive pages are not cached. Internet banking and news sites frequently use this facility. Data requested with an HTTP ‘GET’ is likely to be cached if other conditions are met; data obtained in response to a ‘POST’ is assumed to depend on the data that was POSTed and so is not cached.
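The GET and POST rule and the header-based opt-out can be combined into a small decision function. This is a deliberately simplified sketch: it assumes the server signals "do not cache" with the Cache-Control: no-store response header, and the function name may_store is hypothetical. Real HTTP caching rules involve many more conditions.

```python
def may_store(method, response_headers):
    """Decide whether a cache may keep a copy of a response."""
    if method.upper() != "GET":
        return False  # responses to POST depend on the submitted data
    cache_control = response_headers.get("Cache-Control", "").lower()
    directives = [item.strip() for item in cache_control.split(",")]
    return "no-store" not in directives


print(may_store("GET", {"Cache-Control": "max-age=3600"}))
print(may_store("GET", {"Cache-Control": "no-store"}))
print(may_store("POST", {}))
```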

Internet Basic Concepts

Introduction

By the turn of the century, information, including access to the Internet, will be the basis for personal, economic, and political advancement. The popular name for the Internet is the information superhighway. Whether you want to find the latest financial news, browse through library catalogs, exchange information with colleagues, or join in a lively political debate, the Internet is the tool that will take you beyond telephones, faxes, and isolated computers to a burgeoning networked information frontier.

The Internet supplements the traditional tools you use to gather information, data, graphics and news, and to correspond with other people. Used skillfully, the Internet shrinks the world and brings information, expertise, and knowledge on nearly every subject imaginable straight to your computer.

What is the Internet?

The Internet links computer networks all over the world so that users can share resources and communicate with each other. Some computers, such as those at universities, have direct access to all the facilities on the Internet. Other computers, e.g. privately owned ones, have indirect links through a commercial service provider, which offers some or all of the Internet facilities. In order to be connected to the Internet, you must go through a service provider. Many options are offered with monthly rates. Depending on the option chosen, access time may vary.

The Internet is what we call a meta-network, that is, a network of networks that spans the globe. It’s impossible to give an exact count of the number of networks or users that make up the Internet, but they easily number in the thousands and millions respectively. The Internet employs a set of standardized protocols which allow for the sharing of resources among different kinds of computers that communicate with each other on the network. These standards, sometimes referred to as the Internet Protocol Suite, are the rules that developers adhere to when creating new functions for the Internet.

The Internet is also what we call a distributed system; there are no central archives. Technically, no one runs the Internet. Rather, the Internet is made up of thousands of smaller networks. The Internet thrives and develops as its many users find new ways to create, display and retrieve the information that constitutes the Internet.

History & Development of the Internet

In its infancy, the Internet was conceived by the Department of Defense as a way to protect government communications systems in the event of a military strike. The original network, dubbed ARPANET (for the Advanced Research Projects Agency that developed it), evolved into a communications channel among contractors, military personnel, and university researchers who were contributing to ARPA projects.

The network employed a set of standard protocols to create an effective way for these people to communicate and share data with each other.

ARPANet’s popularity continued to spread among researchers, and in the 1980s the National Science Foundation, whose NSFNet linked several high-speed computers, took charge of what had come to be known as the Internet.

By the late 1980s, thousands of cooperating networks were participating in the Internet.

In 1991, the U.S. High Performance Computing Act established the NREN (National Research & Education Network). NREN’s goal was to develop and maintain high-speed networks for research and education, and to investigate commercial uses for the Internet.

The rest, as they say, is history in the making. The Internet has been improved through the development of services such as Gopher and the World Wide Web.

Even though the Internet is predominantly thought of as a research oriented network, it continues to grow as an informational, creative, and commercial resource every day and all over the world.

Who Pays for the Internet?

There is no clear answer to this question because the Internet is not one “thing”, it’s many things. No one central agency exists that charges individual Internet users. Rather, individuals and institutions who use the Internet pay a local or regional Internet service provider for their share of services. And in turn, those smaller Internet service providers might purchase services from an even larger network. So basically, everyone who uses the Internet in some way pays for part of it.

What makes the Internet work?

The unique thing about the Internet is that it allows many different computers to connect and talk to each other. This is possible because of a set of standards, known as protocols, that govern the transmission of data over the network: TCP/IP (Transmission Control Protocol/Internet Protocol). Most people who use the Internet aren’t so interested in details related to these protocols. They do, however, want to know what they can do on the Internet and how to do it effectively.

The Client/Server Model:

The most popular Internet tools operate as client/server systems. You’re running a program called a Web client. This piece of software displays documents for you and carries out your requests. If it becomes necessary to connect to another type of service–say, to set up a Telnet session, or to download a file–your Web client will take care of this, too. Your Web client connects (or “talks”) to a Web server to ask for information on your behalf.

The Web server is a computer running another type of Web software which provides data, or “serves up” an information resource to your Web client.

All of the basic Internet tools–including Telnet, FTP, Gopher, and the World Wide Web–are based upon the cooperation of a client and one or more servers. In each case, you interact with the client program and it manages the details of how data is presented to you or the way in which you can look for resources. In turn, the client interacts with one or more servers where the information resides. The server receives a request, processes it, and sends a result, without having to know the details of your computer system, because the client software on your computer system is handling those details.

The advantage of the client/server model lies in distributing the work so that each tool can focus or specialize on particular tasks: the server serves information to many users while the client software for each user handles the individual user’s interface and other details of the requests and results.
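The request-and-reply exchange described above can be sketched in a few lines of Python. This is only an illustration, not how any particular Web client or server is built: the function names, the loopback address and the reply text are all assumptions made for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: accept one client, read its request, send back a result."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"You asked for: {request}".encode("utf-8"))


def ask_server(host: str, port: int, request: str) -> str:
    """Client side: connect, send one request, and return the server's reply."""
    with socket.create_connection((host, port)) as client:
        client.sendall(request.encode("utf-8"))
        return client.recv(1024).decode("utf-8")


def demo() -> str:
    # Bind to port 0 so the operating system picks a free port (assumption:
    # the loopback interface 127.0.0.1 is available on this machine).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return ask_server("127.0.0.1", port, "index page")
    finally:
        worker.join()
        listener.close()
```

The server never needs to know what kind of machine the client runs on; it only sees the bytes of the request, which is the division of labour the client/server model relies on.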

The Use of Local Clients:

Every computer should be equipped with basic client software packages that allow you to perform functions such as electronic mail, Telnet, Gopher, and FTP.

Electronic mail on the Internet:

Electronic mail, or e-mail, is probably the most popular and widely used Internet function. E-mail, email, or just mail, is a fast and efficient way to communicate with friends or colleagues. You can communicate with one person at a time or thousands; you can receive and send files and other information. You can even subscribe to electronic journals and newsletters. You can send an e-mail message to a person in the same building or on the other side of the world.

How does E-mail Work?

E-mail is an asynchronous form of communication, meaning that the person whom you want to read your message doesn’t have to be available at the precise moment you send your message. This is a great convenience for both you and the recipient.

On the other hand, the telephone, which is a synchronous communication medium, requires that both you and your listener be on the line at the same time in order for you to communicate (unless you leave a voice message).

It is impossible to discuss here all the details of the many e-mail packages available to Internet users. Fortunately, however, most of these programs share basic functionality which allows you to:

*send and receive mail messages

*save your messages in a file

*print mail messages

*reply to mail messages

*attach a file to a mail message

Reading an Internet Address:

To use Internet e-mail successfully, you must understand how the names and addresses for computers and people on the Internet are formatted. Mastering this technique is just as important as knowing how to use telephone numbers or postal addresses correctly.

Fortunately, after you get the hang of them, Internet addresses are usually no more complex than phone numbers and postal addresses.

And, like those methods of identifying a person, an organization, or a geographic location–usually by a telephone number or a street address–Internet addresses have rules and conventions for use.

Sample Internet Address: custcare@aucegypt.edu

The Internet address has three parts:

1.a user name [custcare in the example above]

2.an “at” sign (@)

3.the address of the user’s mail server [aucegypt.edu in the example above]

Sometimes it’s useful to read an Internet address (like custcare@aucegypt.edu) or a domain name from right to left because it helps you determine information about the source of the address.

An address like 201B6DQF@asu.edu doesn’t tell me much about the person who’s sending me a message, but I can deduce that the sender is affiliated with an educational institution because of the suffix edu.

The right-most segment of a domain name usually adheres to the naming conventions listed below:

EDU   Educational sites in the United States

COM  Commercial sites in the United States

GOV  Government sites in the United States

NET   Network administrative organizations

MIL    Military sites in the United States

ORG  Organizations in the U.S. not covered by the categories above (e.g., non-profit organizations).
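Reading an address from right to left, as described above, can be sketched as follows. The function name and the returned dictionary layout are assumptions made for this example; the suffix table simply repeats the conventions listed above.

```python
# Suffix meanings copied from the list of naming conventions above.
CONVENTIONS = {
    "edu": "Educational site",
    "com": "Commercial site",
    "gov": "Government site",
    "net": "Network administrative organization",
    "mil": "Military site",
    "org": "Other organization",
}


def read_address(address: str) -> dict:
    """Split an e-mail address into user name, mail server and suffix."""
    user, _, server = address.rpartition("@")
    if not user or not server:
        raise ValueError(f"not an Internet e-mail address: {address!r}")
    # The right-most segment of the domain name is the suffix.
    suffix = server.rsplit(".", 1)[-1].lower()
    return {
        "user": user,
        "mail_server": server,
        "suffix": suffix,
        "kind": CONVENTIONS.get(suffix, "unknown"),
    }
```

For example, `read_address("custcare@aucegypt.edu")` reports the user name `custcare`, the mail server `aucegypt.edu` and the suffix `edu`.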

E-mail: Advantages and Disadvantages of E-mail

E-mail

‘E-mail’ is the abbreviated form of ‘electronic mail’. E-mail is a system for creating, sending and storing textual data in digital form over a network. E-mail is transferred with the Simple Mail Transfer Protocol (SMTP), a protocol used to send e-mail from one server to another, and it follows the store-and-forward model. In this model, users send and receive information on their own computers, but the computer is used only for connecting to the e-mail infrastructure; messages are transmitted and stored only once a connection with that infrastructure is established.

E-mail is one of the many technological developments that have influenced our lives. It has changed the way we communicate, so it is worth examining both the benefits and the harmful effects of this popular Internet tool.

Advantages of Email
The benefits of e-mail are huge in number.

  • Easy to use: E-mail frees us from the tedious task of managing data of daily use. It helps us to manage our contacts, send mails quickly, maintain our mail history, store the required information, etc.
  • Speed: E-mail is usually delivered within seconds, anywhere across the globe. Few other services match e-mail in terms of speed.
  • Easy to prioritize: Since the mails have subject lines, it is easy to prioritize them and ignore unwanted mails.
  • Reliable and secure: Constant efforts are being made to improve the security of electronic mail, making it one of the more secure ways of communication.
  • Informal and conversational: The language used in e-mails is generally simple and thus makes the communication informal. Sending and receiving e-mails takes less time, so it can be used as a tool for interaction.
  • Easier for reference: When one needs to reply to a mail, the mailing system can include the previous mails as references. This refreshes the recipient’s memory of what he or she is reading.
  • Automated e-mails: It is possible to send automated e-mails using special programs such as autoresponders. Autoresponders reply to the sender with generalized pre-written text messages.
  • Environment friendly: Postal mail uses paper as a medium to send letters. Electronic mail thus saves a lot of trees from being axed. It also saves the fuel needed for transportation.
  • Use of graphics: Colorful greeting cards and interesting pictures can be sent through e-mails. This adds value to the e-mail service.
  • Advertising tool: Many individuals and companies are using e-mails to advertise their products, services, etc.

Disadvantages of Email
E-mail, though beneficial in our day-to-day life, has its own drawbacks that are of late coming to the fore.

  • Viruses: These are computer programs having the potential to harm a computer system. These programs copy themselves and further infect the computer. The recipient needs to scan the mails, as viruses are transmitted through them and have the potential to harm computer systems.
  • Spam: E-mail used to send unsolicited messages and unwanted advertisements creates a nuisance and is termed spam. Checking and deleting these unwanted mails can consume a lot of time, and it has become necessary to block or filter them by means of spam filters. Spamming includes sending hoax e-mails. E-mail spoofing is another common practice used for spamming. Spoofing involves deceiving the recipient by altering the e-mail headers or the addresses from which the mail is sent.
  • Hacking: The act of breaking into computer security is termed hacking. After an e-mail is sent and before it is received by the desired recipient, it “bounces” between servers located in different parts of the world. Hence, the e-mail can be intercepted along the way by a hacker.
  • Misinterpretation: One has to be careful while posting any kind of content through an e-mail. If typed in a hurry, the matter could be misinterpreted.
  • Lengthy mails: If the mail is too long and not properly presented the reader may lose interest in reading it.
  • Not suitable for business: Since the content posted via e-mails is considered informal, there is a chance of business documents going unnoticed. Thus, urgent transactions and especially those requiring signatures are not managed through e-mails.
  • Crowded inbox: Over a period of time, the e-mail inbox may get crowded with mails. It becomes difficult for the user to manage such a huge chunk of mails.
  • Need to check the inbox regularly: In order to be updated, one has to check his e-mail account regularly.

 

 

Parts of an email message

An email message consists of the following general components:

Headers

The message headers contain information concerning the sender and recipients. The exact content of mail headers can vary depending on the email system that generated the message. Generally, headers contain the following information:

  • Subject. This is what appears in most email systems that list email messages individually. A subject line could be something like “2005 company mission statement” or, if your spam filtering application is too lenient, “Lose weight fast!!! Ask me how.”
  • Sender (From). This is the sender’s Internet email address. It is usually presumed to be the same as the Reply-to address, unless a different one is provided.
  • Date and time received (On). The time the message was received.
  • Reply-to. This is the Internet email address that will become the recipient of your reply if you click the Reply button.
  • Recipient (To:). First/last name of email recipient, as configured by the sender.
  • Recipient email address. The Internet mail address of the recipient, or where the message was actually sent.

Body

The body of a message contains text that is the actual content, such as “Employees who are eligible for the new health care program should contact their supervisors by next Friday if they want to switch.”  The message body also may include signatures or automatically generated text that is inserted by the sender’s email system.

Attachments

These are optional and include any separate files that may be part of the message.
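The three components (headers, body and attachments) can be put together with Python’s standard `email` package. This is a small sketch; the addresses, the file name and the attachment contents are made up for the example, while the subject and body text reuse the samples quoted above.

```python
from email.message import EmailMessage


def build_message() -> EmailMessage:
    """Assemble a message with headers, a text body and one attachment."""
    msg = EmailMessage()
    # Headers: information about the sender and the recipients.
    msg["Subject"] = "2005 company mission statement"
    msg["From"] = "sender@example.com"          # hypothetical address
    msg["Reply-To"] = "replies@example.com"     # hypothetical address
    msg["To"] = "Pat Example <recipient@example.com>"  # hypothetical address
    # Body: the actual content of the message.
    msg.set_content(
        "Employees who are eligible for the new health care program should "
        "contact their supervisors by next Friday if they want to switch."
    )
    # Attachment: an optional separate file carried with the message.
    msg.add_attachment(
        b"placeholder file contents",
        maintype="application",
        subtype="octet-stream",
        filename="mission.txt",
    )
    return msg


def attachment_names(msg: EmailMessage) -> list:
    """List the file names of all attachments in the message."""
    return [part.get_filename() for part in msg.iter_attachments()]
```

Calling `print(build_message())` shows the raw message text, in which the headers come first, followed by the body and then the encoded attachment.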

Internet Mail Protocols

There are currently three main Internet e-mail protocols: SMTP, POP3, and IMAP4. There is also ODMR, which is a variation of SMTP. (There are several other mail protocols, e.g. UUCP, which are currently less widely used and are not supported by VPOP3.) IMAP4 is only supported by VPOP3 Enterprise, not by VPOP3 Standard.

SMTP Protocol

SMTP (Simple Mail Transfer Protocol) is really intended for permanent connections to the Internet. The SMTP ‘client’ connects to an SMTP ‘server’ to send a message. There is no way to request a specific message using SMTP, but there are extensions to request a server to start sending any messages it has.

When a message is sent using SMTP, it is sent in two parts:

1) An Envelope – this contains the email address it was sent from (typically for error reports) and a list of people to receive the message. This is not normally seen by users.

2) The message Data – this contains the message that you typically see.

The Envelope may contain a copy of the information in the Data’s From: and To: header fields, but it may contain other information which is not contained in the message at all (for instance, for mailing list messages or if BCC addressing is used).
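The difference between the Envelope and the message Data is easiest to see with a BCC recipient. The sketch below is an illustration only: the function names and addresses are hypothetical, and real mail software may build the envelope differently. It derives the envelope from a message and removes the Bcc header from the Data, so the hidden recipient appears only in the envelope.

```python
import copy
from email.message import EmailMessage
from email.utils import getaddresses


def split_envelope(msg: EmailMessage):
    """Return (envelope sender, envelope recipients, message data without Bcc)."""
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(str(value) for value in msg.get_all(name, []))
    # Envelope recipients: every address the message must be delivered to.
    recipients = [address for _name, address in getaddresses(fields)]
    # Envelope sender: the bare address, used for error reports.
    sender = getaddresses([str(msg["From"])])[0][1]
    # The Data is what recipients see, so the Bcc header is dropped from it.
    data = copy.deepcopy(msg)
    del data["Bcc"]
    return sender, recipients, data


def example():
    msg = EmailMessage()
    msg["From"] = "Carol <carol@example.com>"   # hypothetical addresses
    msg["To"] = "alice@example.com"
    msg["Bcc"] = "bob@example.com"
    msg["Subject"] = "Meeting"
    msg.set_content("See you at ten.")
    return split_envelope(msg)
```

Here the envelope lists both alice@example.com and bob@example.com, while the Data that both of them receive mentions only alice@example.com.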

If you have an SMTP account with your Internet Provider, then you need to run an SMTP ‘server’ (e.g. VPOP3) on your PC, and then somehow tell your Internet Provider to start sending messages to it after you’ve connected. Some Internet Providers can automatically detect you dialing into them, and immediately start sending mail to your SMTP server. Other Internet Providers need you to issue an SMTP command such as ETRN to their server to trigger mail delivery.

A few Internet Providers require a non-standard action, such as sending a ‘Finger’ command  to their server to trigger mail delivery. You can use an external program, set as the VPOP3 Post-Connection Extension to issue this command.

Once SMTP mail delivery has started there is really no way for the server to reject messages based on any other criteria apart from the data contained in the Envelope (i.e. From or To addresses). Also, once the message has been sent to an SMTP server, the client typically discards its copy.

This means that if VPOP3 is running as your SMTP server to receive mail from your Internet Provider, some useful features such as being able to limit message download size, the Download Rules, leaving messages on the ISP server etc, cannot be used. VPOP3’s SMTP Rules can perform many actions you may require, but they are not as flexible as Download Rules, because VPOP3 cannot see the message header without receiving the entire message.

When sending messages to another site, there are really two ways of doing this, both of which are typically accepted:

The mail server can send the message directly to the destination site’s mail server

The mail server can send the message to a relay-server which then sends the message to the destination site.

We recommend the use of the second method. Some of the reasons for this are:

It is easier to configure, and fits in with most users’ understanding of how mail works.

It is a lot quicker over a dial-up connection. If the first method is used, then any message to more than one recipient typically has to be sent multiple times (once for each recipient). There is also a lot of querying of DNS servers which can be time consuming.

Many Internet Providers (e.g. AOL) will reject mail that comes directly from a computer on a dial-up connection, as an anti-spam protection method.

 

ODMR Protocol

The ODMR (On Demand Mail Relay) protocol is a variation on SMTP which has been designed to allow SMTP mail delivery to a dynamic IP address. With ODMR, the ODMR client (VPOP3) connects to an ODMR server (at the ISP), logs on with the ISP account details, and from then on acts identically to an SMTP server (so the ODMR client becomes the SMTP server, and the ODMR server becomes the SMTP client).

This allows the advantages of SMTP without requiring a static IP address.

POP3 Protocol

POP3 (Post Office Protocol Version 3) was created for dial-up Internet accounts because of the limitations with the SMTP protocol. When collecting mail from an ISP using POP3, the ‘client’ is the PC at the user’s end, and it is in total control of what messages it receives and which ones it doesn’t.

The POP3 client can also typically view message headers without downloading the entire message, see the message size before downloading it, delete messages without downloading them, or leave messages on the server after downloading them. Because of these POP3 facilities, VPOP3 can do a lot more to help you.
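The POP3 facilities listed above map onto the `list()` and `top()` methods of Python’s standard `poplib.POP3` class. The sketch below is not how VPOP3 works internally; the helper names and the `FakeMailbox` stand-in are assumptions, included so that the example runs without a real mail server. With a real account, `conn` would be a `poplib.POP3` or `poplib.POP3_SSL` object instead.

```python
class FakeMailbox:
    """Stand-in that mimics the poplib.POP3 methods used below (no server needed)."""

    def __init__(self, messages):
        self.messages = messages  # {message number: raw message bytes}

    def list(self):
        entries = [
            f"{number} {len(raw)}".encode("ascii")
            for number, raw in sorted(self.messages.items())
        ]
        return b"+OK", entries, 0

    def top(self, number, body_lines):
        # Simplification: return the header lines only, ignoring body_lines.
        headers = self.messages[number].split(b"\r\n\r\n", 1)[0]
        return b"+OK", headers.split(b"\r\n"), len(headers)


def message_sizes(conn) -> dict:
    """See each message's size before downloading it (POP3 LIST)."""
    _response, entries, _octets = conn.list()
    sizes = {}
    for entry in entries:
        number, size = entry.decode("ascii").split()
        sizes[int(number)] = int(size)
    return sizes


def subject_of(conn, number: int):
    """View a message's headers without downloading the whole message (POP3 TOP)."""
    _response, lines, _octets = conn.top(number, 0)
    for line in lines:
        text = line.decode("utf-8", errors="replace")
        if text.lower().startswith("subject:"):
            return text.split(":", 1)[1].strip()
    return None


def plan_download(conn, max_bytes: int) -> list:
    """Pick the messages small enough to download; the rest stay on the server."""
    return [n for n, size in message_sizes(conn).items() if size <= max_bytes]
```

Because the client decides which messages to fetch, a size limit like the one in `plan_download` can be applied before any message body crosses the connection.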

One of the common problems which occurs if multiple email addresses are directed to a single POP3 mailbox is that the SMTP Envelope (see above) is lost when the message is placed in the mailbox. This means that the explicit message routing information is lost, and all that VPOP3 has to go on is the data in the message headers (e.g. To, Cc, etc.). This can cause problems if you receive messages from mailing lists or messages which include Bccs. Some ISPs get around these problems by extending the POP3 protocol (e.g. Demon Internet Services do this) or by adding special message header fields which VPOP3 can use if you tell it about them.

 

IMAP4 Protocol

IMAP4 (Internet Message Access Protocol V4) is an email protocol which is sometimes used instead of the POP3 protocol. With IMAP4, email is stored on the mail server and can be accessed from any IMAP4 email client on the network. With POP3, email is downloaded to the mail client, where it is accessed.

When using IMAP4 many of the functions of the email client are performed by the mail server instead. This includes things such as searching for messages, moving messages between folders etc.

In most cases the user will not notice any difference between using IMAP4 and using POP3, so use whichever is more appropriate.

IMAP4 has some advantages in some situations:

  1. Because all email is stored on the mail server, it is easier to back up all email in one batch.
  2. Users can access their email from anywhere on the network, so if your users do not have a fixed computer to use, IMAP4 can be the solution.
  3. Users can share mailboxes. The IMAP4 protocol allows several people to log onto a mailbox at once to read messages. This can be useful for ‘noticeboard’ type applications. Access Control Lists allow you to restrict which users can do which tasks in a mailbox.

There are some disadvantages to using IMAP4:

  1. Because all mail has to be transferred over the network as it is read it can be slower than reading mail from the local PC where it has been downloaded to using POP3.
  2. The load on the network is usually much more. Mail is transferred every time it is read, so if a user reads a message at 10 different times, the message is transferred over the network 10 times rather than just once.
  3. The load on the mail server is much more than with the POP3 protocol. Searching for messages etc will increase the load even more.
  4. The mail server needs much more mail storage space. When using POP3 the mail is stored on a user’s own PC so the server is not usually affected by large amounts of mail.
  5. IMAP4 was not originally designed for remote users. Some email clients allow offline access to IMAP4 mailboxes but because this is not what the protocol was designed for, it can sometimes be unintuitive. The POP3 protocol was designed for remote users, so it is often more efficient and more intuitive.
  6. The IMAP4 protocol is not supported by the Standard VPOP3 software, only by VPOP3 Enterprise.

 

 

Internet Routing Protocol

Internet Routing

How does Internet routing work? IP addresses and packet switching provide the technical infrastructure which routing protocols use to transmit packets across the Internet. The Internet Protocol transfers packets between networks and provides the software bridge that knits the whole thing together.

Robert Kahn and Vinton Cerf invented the basic architecture of Internet routing along with their development of the TCP/IP networking protocol.

TYPES OF INTERNET ROUTING PROTOCOLS:

Interior Gateway Protocols (IGP)

Exterior Gateway Protocol (EGP)

INTERIOR GATEWAY PROTOCOLS

Interior Gateway Protocols (IGP) are used to route Internet communications within a local network under a single administration (known as an autonomous system), such as the network of an office building. The two main types of IGP protocols are described in the following sections, along with an example proprietary protocol for comparison purposes:

Routing Information Protocol (RIP)

Open Shortest Path First (OSPF)

Interior Gateway Routing Protocol (IGRP).

 

ROUTING INFORMATION PROTOCOL

The Routing Information Protocol (RIP) provides the standard IGP protocol for local area networks, and provides great network stability, guaranteeing that if one network connection goes down the network can quickly adapt to send packets through another connection. All RIP routing protocols are based on a distance vector algorithm called the Bellman-Ford algorithm, after Bellman’s development of the equation used as the basis of dynamic programming, and Ford’s early work in the area.

What makes RIP work is a routing database that stores information on the fastest route from computer to computer, an update process that enables each router to tell other routers which route is the fastest from its point of view, and an update algorithm that enables each router to update its database with the fastest route communicated from neighboring routers.
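The three ingredients described above (a routing database, an update process and an update algorithm) can be sketched as a small distance-vector simulation. This is a simplified illustration rather than a RIP implementation: the function names and the three-router topology are assumptions, every link is given a cost of one hop, and refinements such as timers and split horizon are left out. RIP measures routes in hops and treats 16 hops as unreachable.

```python
INFINITY = 16  # RIP treats a metric of 16 hops as "unreachable"


def merge_update(table, neighbor, neighbor_table, link_cost=1):
    """Update algorithm: fold one neighbor's advertised routes into our table.

    Each table maps destination -> (hop count, next router). Returns True if
    anything changed, so the caller knows a new advertisement is needed.
    """
    changed = False
    for dest, (their_cost, _their_via) in neighbor_table.items():
        new_cost = min(their_cost + link_cost, INFINITY)
        cost, via = table.get(dest, (INFINITY, None))
        # Take the route if it is shorter, or if our current next router
        # is this neighbor and its own cost to the destination has changed.
        if new_cost < cost or (via == neighbor and new_cost != cost):
            table[dest] = (new_cost, neighbor)
            changed = True
    return changed


def converge(links):
    """Update process: routers keep exchanging tables until nothing changes."""
    routers = sorted({router for link in links for router in link})
    tables = {router: {router: (0, None)} for router in routers}
    changed = True
    while changed:
        changed = False
        for a, b in links:
            # Each end of the link advertises a snapshot of its table to the other.
            changed |= merge_update(tables[a], b, dict(tables[b]))
            changed |= merge_update(tables[b], a, dict(tables[a]))
    return tables
```

With the hypothetical links `[("A", "B"), ("B", "C")]`, router A ends up knowing only that C is two hops away via B; it never learns the full path, which is the defining trait of a distance-vector protocol.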

OPEN SHORTEST PATH FIRST

Open Shortest Path First (OSPF) is a particularly efficient IGP routing protocol that is faster than RIP, but also more complex.

The OSPF routing algorithm was created to provide an alternative to RIP, based on Shortest Path First algorithms instead of the Bellman-Ford algorithm. It uses a tree that describes the network topology to define the shortest path from each router to each destination address. Since OSPF keeps track of entire paths, it has more overhead than RIP, but provides more options. The main difference between OSPF and RIP is that RIP only keeps track of the next router towards each destination address, while OSPF keeps track of a complete topological database of all connections in the local network.
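A Shortest Path First computation over a complete topology database can be sketched with Dijkstra’s algorithm. The sketch assumes the router already holds the whole topology as a dictionary of link costs; the function name, the four-router topology and its costs are made up for the example, and real OSPF adds areas, link-state flooding and much more.

```python
import heapq


def shortest_paths(topology, source):
    """Dijkstra's algorithm: return {destination: (total cost, first hop)}."""
    best = {source: (0, None)}
    queue = [(0, source, None)]
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # a cheaper path to this node was already found
        for neighbor, link_cost in topology[node].items():
            new_cost = cost + link_cost
            # The first hop is fixed when the path leaves the source router.
            hop = neighbor if node == source else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(queue, (new_cost, neighbor, hop))
    return best


# Hypothetical topology database: router -> {neighbor: link cost}.
TOPOLOGY = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
```

Unlike the distance-vector approach, every router here runs the computation itself over the same topology database, which is where the extra overhead and the extra flexibility both come from.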

Interior Gateway Routing Protocol (IGRP)

Enhanced IGRP uses the same distance vector algorithm and distance information as IGRP. However, the convergence properties and the operating efficiency of enhanced IGRP have improved significantly.

The convergence technology is based on research conducted at SRI International and employs an algorithm referred to as the Diffusing Update Algorithm (DUAL). This algorithm guarantees loop-free operation at every instant throughout a route computation and allows all routers involved in a topology change to synchronize at the same time. Routers that are not affected by topology changes are not involved in re-computations.

The convergence time with DUAL rivals that of any other existing routing protocol. The initial implementation of IGRP operated in Internet Protocol (IP) networks. Enhanced IGRP extends IGRP so that it is independent of the network-layer protocol. In addition to IP, it now also operates in AppleTalk and Novell IPX networks.

Exterior Gateway Protocols (EGP)

While IGP protocols are used within local networks, Exterior Gateway Protocols (EGP) are used for routing between networks, generally on the Internet backbone itself, linking the different networks together. The following sections provide more information on the two common EGP protocols:

Border Gateway Protocol (BGP)

Exterior Gateway Protocol (EGP).

Border Gateway Protocol (BGP)

The most common Exterior Gateway Protocol in use on the Internet is the Border Gateway Protocol (BGP), ensuring that packets get to their destination network regardless of current network conditions.

Like RIP, the BGP algorithm provides great network stability, guaranteeing that if one Internet network line goes down, BGP routers can quickly adapt to send packets through another connection.

When a BGP router first comes up on the Internet, either for the first time or after being turned off, it establishes connections with the other BGP routers with which it directly communicates. The first thing it does is download the entire routing table of each neighboring router. After that it only exchanges much shorter update messages with other routers.

BGP routers send and receive update messages to indicate a change in the preferred path to reach a computer with a given IP address. If the router decides to update its own routing tables because this new path is better, then it will subsequently propagate this information to all of the other neighboring BGP routers to which it is connected, and they will in turn decide whether to update their own tables and propagate the information further.
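The update-and-propagate behaviour described above can be sketched as follows. This is a heavily simplified model, not BGP itself: it assumes each update carries the list of networks (AS numbers) the path crosses, and that a path crossing fewer networks is always preferred, whereas real BGP applies many more rules. The function name, the AS numbers and the address prefix are hypothetical.

```python
def receive_update(table, local_as, prefix, as_path):
    """Process one update; return the path to pass on to neighbors, or None.

    table maps an address prefix to the preferred path (a list of AS numbers).
    """
    if local_as in as_path:
        return None  # the path already crosses our own network: a loop
    current = table.get(prefix)
    if current is not None and len(current) <= len(as_path):
        return None  # the path we already have is at least as good
    # The new path is better: update our table and propagate it,
    # adding our own AS number to the front of the path.
    table[prefix] = list(as_path)
    return [local_as] + list(as_path)
```

A return value of `None` means the router keeps its table as it is and sends nothing further, so only genuine improvements spread through the network.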

Exterior Gateway Protocol (EGP)

The protocol used throughout the 1980s and into the mid-1990s was also, somewhat confusingly, named the Exterior Gateway Protocol (EGP). However, the EGP protocol had several problems, most notably an inability to scale up to support the growth in the size of the Internet.