IB DP Computer Science Option C: Web science, C.3 – Distributed approaches to the web, SL Paper 2

Question

The World Wide Web can be divided into three categories: the surface web, the dark web and the deep web.
The dark web is only accessible by using specialist software, such as TOR and I2P. Many users of the dark web use it to protect their anonymity.
Many users of the dark web use peer-to-peer (P2P) networks for activities like torrent streaming. This opens up ports on the computer to upload and download data.
a. Distinguish between the surface web and the deep web. [2]
b. Explain how a user’s anonymity can be maintained while accessing the dark web. [3]
c. Explain why users have concerns about opening up ports to upload and download data. [3]
d. The founders of the World Wide Web intended it to be a decentralized and democratic environment.
To what extent have the aspirations of the founders of the World Wide Web been met? [6]

Answer/Explanation

Ans:

a.
The surface web is the part of the web that can be reached by a search engine whereas the deep web cannot;
The deep web may consist of dynamic content such as the result of database queries or may be protected proprietary content;

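b.
The dark web uses a layered encryption system (for example, onion routing in TOR);
Data is routed through a series of intermediate servers (relays), each of which removes only one layer of encryption;
Which means no single relay knows both the origin and the destination of the data, and it is almost impossible for an observer to decrypt the information layer by layer;
With the result that the user’s details are practically untraceable, and their anonymity can be maintained;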
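
The layering idea in this answer can be sketched in Python. This is a toy illustration only: the XOR keystream stands in for real encryption and is not secure, and the relay names, keys and the `wrap`/`peel` helpers are hypothetical, not part of TOR or I2P. The sketch shows that each relay, using its own key, learns only the name of the next hop.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive a repeatable byte stream from a key (toy construction, not secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, path: list[tuple[str, bytes]]) -> bytes:
    """Sender builds the packet from the inside out.

    The innermost layer (for the last relay) names the final destination;
    each outer layer names only the next relay in the path.
    """
    packet = message
    next_hop = "destination"
    for name, key in reversed(path):
        packet = xor_layer(next_hop.encode() + b"|" + packet, key)
        next_hop = name
    return packet


def peel(packet: bytes, key: bytes) -> tuple[str, bytes]:
    """A relay removes its own layer and sees only the next hop and an opaque payload."""
    plain = xor_layer(packet, key)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner


if __name__ == "__main__":
    # Hypothetical three-relay path with made-up keys.
    path = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
    packet = wrap(b"GET /index.html", path)
    for name, key in path:
        next_hop, packet = peel(packet, key)
        print(f"{name} forwards to {next_hop}")
    print(packet)
```

In this sketch only the last relay sees the original request, and only the first relay knows who sent the packet, which is the property the answer relies on.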

c.
A port is a numbered endpoint used to direct network communication to a particular application on a computer;
Certain ports such as Port 21 (FTP), 23 (Telnet) and 80 (HTTP) are reserved;
Every open port provides a possible route into that computer from the network;
This means that the security of that computer may be potentially compromised every time a new port is opened;
Or port conflicts may occur when more than one application tries to use a specified port;
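
The port conflict mentioned in the last point can be reproduced with a short Python sketch using the standard `socket` module. The helper names `try_listen` and `demo_port_conflict` are hypothetical, and the sketch assumes the operating system allows listening on the loopback address.

```python
import socket


def try_listen(port: int, host: str = "127.0.0.1"):
    """Return a listening TCP socket on (host, port), or None if the port is unavailable."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError:
        # Typically "address already in use": another socket holds this port.
        sock.close()
        return None


def demo_port_conflict():
    """Open one port, then show that a second attempt on the same port is refused."""
    first = try_listen(0)  # port 0 asks the OS to pick any free port
    if first is None:
        raise RuntimeError("could not open any port")
    port = first.getsockname()[1]
    second = try_listen(port)  # a second "application" asks for the same port
    conflicted = second is None
    if second is not None:
        second.close()
    first.close()
    return port, conflicted


if __name__ == "__main__":
    port, conflicted = demo_port_conflict()
    print(f"port {port} conflict detected: {conflicted}")
```

The sketch binds to `127.0.0.1`, so the port is reachable only from the same machine. Binding to all interfaces instead would expose it to the network, which is the security concern raised in the answer.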

d.
The World Wide Web has enabled citizens to communicate easily and ‘ordinary’ citizens to express their opinions;
Therefore, the ability to publish is not confined to certain ‘privileged’ groups such as broadcasters and journalists;
The World Wide Web has to a large degree given access to common resources to all citizens globally who have access to the Internet;
However, it can be argued that the evolution of the World Wide Web has led to a greater centralization of power, for example the digital oligarchs (Microsoft, Google, Amazon, Apple, Facebook);
This centralization has led to a reduction in democracy as the digital oligarchs have an increasing stranglehold over the lives of their ‘digital subjects’ through the aggregation, analysis and monetization of their data;
There are still issues with a lack of digital democracy, for example, many citizens may not have access to the World Wide Web because of income, geography or a lack of the necessary skills;

Question

The internet and World Wide Web are often considered to be the same, or the terms are used in the wrong context.
Many organizations produce computer-based solutions that implement open standards.
A search engine is software that allows a user to search for information. The most commonly used search algorithms are the PageRank and HITS algorithms.
a. Distinguish between the internet and the World Wide Web. [2]
b. Outline two advantages of using open standards. [4]
c. Outline why a search engine using the HITS algorithm might produce different page ranking from one using the PageRank algorithm. [2]
d. Web crawlers browse the World Wide Web.
Explain how data stored in a meta-tag is used by a web crawler. [3]

Answer/Explanation

Ans:

a.
The internet is a global network of interconnected computers / a network of networks;
The World Wide Web is software / a service that runs on the hardware of the internet and provides access to content / a collection of pages that can be accessed through hyperlinks / a way of accessing and sharing the information that is held on the internet in webpages;
The World Wide Web uses HTTP, which is only one of the many protocols used on the internet;
E-mail, File Transfer Protocol (FTP), and instant messaging services are part of the internet but not of the web;

b.
Open standards provide a publicly available specification for a specified task;
This is an agreed set of parameters that enable interoperability and/or compatibility to occur;
Using open standards means that you are not subject to a governing body with its own agenda/self-interest;
Thus, you can be confident that you won’t be subject to fees/bias;
Open standards promote interoperability;
This enables the various devices to communicate with each other;
Open standards advocates also argue that openness encourages better and more secure systems;
This is because more people are able to analyse the standards and the resulting software, and no one has a proprietary interest in suppressing knowledge of problems to keep sales up;

c.
The HITS algorithm ranks a page based on a combination of its importance as a hub (a page that links to good authorities) and as an authority (a page that is linked to by good hubs);
The PageRank algorithm ranks a page by counting the number and quality of links to it to determine its relative importance;
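
The difference between the two algorithms can be seen on a small example. The Python sketch below is a simplified illustration under stated assumptions: the five-page link graph is made up, the `pagerank` and `hits` functions are hypothetical helpers using plain power iteration, and HITS is run on the whole toy graph rather than on a query-specific subset of pages as it usually would be.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; pages with no out-links share their rank with every page."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                # Each page passes an equal share of its rank along every out-link.
                new[q] += damping * rank[p] / len(links[p])
        rank = new
    return rank


def hits(links, iterations=50):
    """Iterative HITS; returns (hub scores, authority scores), each normalised to unit length."""
    pages = list(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for p in pages:
            for q in links[p]:
                auth[q] += hub[p]
        norm = sum(v * v for v in auth.values()) ** 0.5
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score: sum of the authority scores of the pages linked to.
        hub = {p: sum(auth[q] for q in links[p]) for p in pages}
        norm = sum(v * v for v in hub.values()) ** 0.5
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth


# Made-up graph: three pages link to X, and X links only to Y.
LINKS = {"h1": ["X"], "h2": ["X"], "h3": ["X"], "X": ["Y"], "Y": []}

if __name__ == "__main__":
    pr = pagerank(LINKS)
    hub, auth = hits(LINKS)
    print("Top page by PageRank:", max(pr, key=pr.get))
    print("Top page by HITS authority:", max(auth, key=auth.get))
```

On this toy graph PageRank places Y first, because X passes all of its rank to Y, while HITS places X first as an authority, because X is pointed to by three hubs and Y by only one weak hub. The same set of links therefore yields a different ordering under each algorithm.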

d.
Meta tags are included in the head section of a web page; they are available to a web crawler and give information about the page (such as a description or keywords) that it can make use of;
When the web page is crawled, a copy of the HTML is replicated in the search engine database;
When a user enters text into a search, the search engine retrieves the data indexed from the web page;
And the search engine ranks and displays the content (in order of relevance);
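
How a crawler might read meta tags can be sketched with Python's standard `html.parser` module. This is a minimal sketch, not how any particular search engine works: the sample page and the `MetaTagParser`, `extract_meta` and `may_index` names are hypothetical, and real crawlers weigh many more signals than these tags.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html: str) -> dict:
    """Return a dictionary of meta tag names and their content for one page."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


def may_index(meta: dict) -> bool:
    """Respect a 'noindex' directive in the robots meta tag, if one is present."""
    directives = [d.strip().lower() for d in meta.get("robots", "").split(",")]
    return "noindex" not in directives


# Made-up page used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<meta name="description" content="Revision notes on web science">
<meta name="keywords" content="deep web, PageRank, HITS">
<meta name="robots" content="noindex, follow">
</head><body><p>Hello</p></body></html>"""

if __name__ == "__main__":
    meta = extract_meta(SAMPLE_PAGE)
    print(meta["description"])
    print("index this page:", may_index(meta))
```

The description and keywords could be stored alongside the copy of the page in the index, while the robots tag tells the crawler whether the page should be indexed at all.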
