2024 May Computer Science Paper 2 (SL)

Question 1

Environmental systems and societies (ESS) students are collecting data about the plant species found on sand dunes as part of their internal assessment. The data is collected from 10 sites using a paper form (Figure 1).
The form shown in Figure 1 is used to input data into the Environment database.
(a) State the data type for:
(i) Species
(ii) Gradient
(b) Outline one way that data validation could be carried out on the gradient attribute.
Three of the tables in the Environment database are shown in Figure 2:
(c) Construct an entity relationship diagram (ERD) for the Plant, Site and Distribution tables.
(d) Outline why a composite primary key is used for the Distribution table.
(e) Identify the steps to create a query to calculate the total number of sites where gorse has been found from the samples carried out on 14 October 2019.
(f) Explain how data consistency can be maintained in the Environment database.

Most-appropriate topic codes (IB Computer Science SL):

• Topic A.1 — Basic concepts (Part (f))
• Topic A.2 — The relational database model (Parts (a), (b), (c), (d), (e))

Answer/Explanation

(a)(i)
For the correct answer:
String / Text (accept also Varchar, Alphanumeric)

The Species field stores the name of a plant, such as “Marram grass” or “Gorse”. Since these are textual descriptions composed of letters and possibly spaces, the most natural and flexible storage type is a string or text data type, which can hold any sequence of characters without needing mathematical operations on it.

(a)(ii)
For the correct answer:
Real (accept also Integer, Float, Number, Numeric)

The Gradient attribute records a numerical measurement like $8.2$ or $7.9$, representing the steepness of a sand dune slope. Because these values include a decimal point and are not whole numbers, a real (or float) data type is required to store the fractional part accurately rather than truncating it.

(b)
For the correct answer (any one):
Range check — to ensure input is between $-90$ and $+90$; Presence check — to ensure a value has been input; Type check — to ensure the input is numeric.

One practical way to validate the gradient attribute is to apply a range check. Since the steepest possible slope on earth cannot exceed $90^\circ$, setting a rule that the gradient value must logically fall between $-90$ and $+90$ will immediately flag any obviously erroneous entries, preventing nonsensical data from ever entering the database.
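
As a minimal sketch, the three checks listed in the mark scheme (presence, type, range) could be chained as below. The function name `validate_gradient` and the message strings are assumptions made for illustration; only the bounds of -90 and +90 come from the mark scheme.

```python
def validate_gradient(raw):
    """Return (ok, message) for a gradient value typed into the form."""
    # Presence check: something must have been entered.
    if raw is None or str(raw).strip() == "":
        return False, "missing value"
    # Type check: the entry must be numeric.
    try:
        value = float(str(raw))
    except ValueError:
        return False, "not a number"
    # Range check: bounds taken from the mark scheme.
    if not -90 <= value <= 90:
        return False, "out of range"
    return True, "ok"


print(validate_gradient("8.2"))
print(validate_gradient(120))
```

Running the checks in this order means a blank or non-numeric entry is reported as such, instead of being reported as out of range.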

(c)
For the correct answer:

The ERD must depict three distinct rectangular boxes representing the Plant, Site, and Distribution tables. The line connecting Plant to Distribution should show a one-to-many relationship (one plant species can appear in many distribution records), and similarly the line from Site to Distribution should also show a one-to-many relationship, since each site can host multiple plant distribution entries over time.

(d)
For the correct answer:
Plant_ID and Site_ID can be repeated / not unique — no single attribute in Distribution table can uniquely identify a record; using multiple fields uniquely identifies a tuple/record and removes the need to add a new ID field as Primary Key (which would waste storage space).

In the Distribution table, neither Plant_ID alone nor Site_ID alone can uniquely identify each row because the same plant can be recorded at multiple sites and the same site can host multiple plants. By combining both fields into a composite primary key, every pair becomes unique without needing to invent a separate artificial ID column, keeping the design efficient and semantically meaningful.

(e)
For the correct answer:
Use Plant and Distribution tables; use COUNT on Site_ID; correct species and date conditions; correct condition for connecting the two tables (Plant.Plant_ID = Distribution.Plant_ID).

You would first need to join or link the Plant and Distribution tables using their shared Plant_ID field. Then filter the combined records by setting the Species to “Gorse” and the Date to ‘14/10/2019’. Finally, apply the COUNT function on the Site_ID column to tally how many distinct site entries match those criteria, giving the total number of sites where gorse was sampled on that particular day.
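
The steps above can be sketched with Python's built-in `sqlite3` module. Figure 2 is not reproduced here, so the column list, the ISO date format and every sample row are assumptions made for illustration; only the table names, `Plant_ID`, `Site_ID` and the join condition come from the mark scheme.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Plant (Plant_ID INTEGER PRIMARY KEY, Species TEXT);
    CREATE TABLE Distribution (
        Plant_ID INTEGER, Site_ID INTEGER, Date TEXT,
        PRIMARY KEY (Plant_ID, Site_ID)
    );
    -- Made-up sample rows, not data from the exam paper.
    INSERT INTO Plant VALUES (1, 'Gorse'), (2, 'Marram grass');
    INSERT INTO Distribution VALUES
        (1, 1, '2019-10-14'), (1, 2, '2019-10-14'),
        (1, 3, '2019-10-15'), (2, 1, '2019-10-14');
""")

# Join the two tables, filter on species and date, then count the sites.
query = """
    SELECT COUNT(Distribution.Site_ID)
    FROM Plant
    JOIN Distribution ON Plant.Plant_ID = Distribution.Plant_ID
    WHERE Plant.Species = 'Gorse'
      AND Distribution.Date = '2019-10-14'
"""
gorse_sites = conn.execute(query).fetchone()[0]
print(gorse_sites)  # 2 with the made-up rows above
```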

(f)
For the correct answer (any three):
Any data written must be valid according to all defined rules; Referential integrity — all foreign keys should have a primary key; Cascades — changing data in one table should be carried through to related tables; Normalise the database to remove/reduce redundant data; Triggers to automatically invoke updates; Validation checks; Verification checks; Data updates use transactions to ensure automatic rollback on failure.

Data consistency means that the information stored across related tables never contradicts itself. This can be maintained by enforcing referential integrity so every foreign key points to an existing primary key, using cascading updates so that a change made in one table automatically propagates to all dependent records, and wrapping critical multi-table operations inside transactions that will roll back entirely if any step fails, leaving the database in a clean, coherent state.
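
A minimal sketch of two of these mechanisms, referential integrity and cascading updates, using Python's built-in `sqlite3` module. The schema is a cut-down assumption based on the table names in the question, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE Plant (Plant_ID INTEGER PRIMARY KEY, Species TEXT);
    CREATE TABLE Distribution (
        Plant_ID INTEGER REFERENCES Plant(Plant_ID) ON UPDATE CASCADE,
        Site_ID INTEGER,
        PRIMARY KEY (Plant_ID, Site_ID)
    );
    INSERT INTO Plant VALUES (1, 'Gorse');
    INSERT INTO Distribution VALUES (1, 4);
""")

# Referential integrity: a foreign key with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO Distribution VALUES (99, 4)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Cascade: renumbering the plant is carried through to the related table.
conn.execute("UPDATE Plant SET Plant_ID = 10 WHERE Plant_ID = 1")
cascaded_ids = [row[0] for row in conn.execute("SELECT Plant_ID FROM Distribution")]
print(orphan_rejected, cascaded_ids)
```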

Question 2

The Bucharesti School website allows parents to login and select school transportation for their children. If they select the school bus, they will have to pay for this service at the end of the month.
(a) Identify the steps that take place in a transaction when a parent attempts to pay the school bus at the end of a month.
(b) Explain how the database management system (DBMS) prevents a record being updated by two parents simultaneously.
(c) Identify two roles of the database administrator at Bucharesti School.
(d) Outline two ways that a database management system (DBMS) can be used to ensure the students’ personal data remains secure.
(e) Explain how the developers of the Bucharesti School database can ensure that it has been designed ethically.

Most-appropriate topic codes (IB Computer Science SL):

• Topic A.1 — Basic concepts (Parts (a), (b))
• Topic A.3 — Further aspects of database management (Part (c), (d), (e))

Answer/Explanation

(a)
For the correct answer:
Parent authenticated by the DBMS; Outstanding amount/bill is calculated/displayed; Transaction is initiated for the transport payment; Payment details added / entered for transport; If payment details and payment can be processed by DBMS, then School transport Account is credited and parent’s account is debited; Transaction is committed; Else Transaction is rolled back; Notification sent to the parent.

A payment transaction is a carefully ordered sequence. First the system authenticates the parent, then it calculates the outstanding bus fee for the month. Once the parent confirms payment, the system attempts to debit their account and credit the school’s transport account simultaneously — if both halves succeed, the entire transaction is committed and a receipt is generated; if either half fails, everything is rolled back so no money is lost or double-charged.
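
The commit-or-roll-back behaviour can be sketched with Python's built-in `sqlite3` module. The `Account` table, its balances and the `pay_bus_fee` helper are hypothetical; a real payment system would involve an external payment provider, which is not modelled here. The credit is deliberately written before the debit so that a failed debit shows the earlier credit being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        Name TEXT PRIMARY KEY,
        Balance INTEGER CHECK (Balance >= 0)  -- no overdrafts allowed
    );
    INSERT INTO Account VALUES ('parent', 50), ('school_transport', 0);
""")


def pay_bus_fee(conn, amount):
    """Credit the school and debit the parent as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? "
                "WHERE Name = 'school_transport'", (amount,))
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? "
                "WHERE Name = 'parent'", (amount,))
        return "committed"
    except sqlite3.IntegrityError:
        return "rolled back"


def balances(conn):
    return dict(conn.execute("SELECT Name, Balance FROM Account"))


first = pay_bus_fee(conn, 80)   # more than the parent holds
after_failure = balances(conn)
second = pay_bus_fee(conn, 30)
after_success = balances(conn)
print(first, after_failure, second, after_success)
```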

(b)
For the correct answer:
Record locking / Isolation / Data Locking / Row locking — ensures exclusive editing, done in isolation to prevent the same data item from being changed by two different transactions. OR Optimistic Concurrency Control (OCC) / Multi-version concurrency control (MVCC) — allows multiple users to access the unmodified version of data at the same time; on update request, a check is done to see if the existing data has been modified by another user since it was initially read to prevent lost updates.

When two parents might try to book the last seat on the same bus simultaneously, the DBMS uses record locking to give the first transaction exclusive write access to that specific row. The second transaction must wait until the first completes and releases the lock, ensuring that only one parent can successfully update the available seat count, preventing a double-booking scenario.
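
The explanation above describes locking. The mark scheme also accepts optimistic concurrency control, and that alternative is easier to show deterministically in a single thread, so it is what the sketch below uses. The `Bus` table, its `Version` column and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Bus (Bus_ID INTEGER PRIMARY KEY, Seats_Left INTEGER, Version INTEGER);
    INSERT INTO Bus VALUES (1, 1, 0);  -- one seat left
""")


def read_bus(conn):
    return conn.execute(
        "SELECT Seats_Left, Version FROM Bus WHERE Bus_ID = 1").fetchone()


def book_seat(conn, seen_version):
    """Book a seat only if the row is unchanged since it was read."""
    cur = conn.execute(
        "UPDATE Bus SET Seats_Left = Seats_Left - 1, Version = Version + 1 "
        "WHERE Bus_ID = 1 AND Version = ? AND Seats_Left > 0",
        (seen_version,),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows updated means someone else got there first


# Both parents read the same unmodified row before either writes.
_, version_a = read_bus(conn)
_, version_b = read_bus(conn)
first_ok = book_seat(conn, version_a)
second_ok = book_seat(conn, version_b)
seats_left = read_bus(conn)[0]
print(first_ok, second_ok, seats_left)
```

The second update matches no rows because the version number has moved on, so the lost update is prevented without making the second parent wait on a lock.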

(c)
For the correct answer (any two):
Approving Data Access / Managing user accounts; Monitoring Performance / Performance tuning; Backup and Recovery; Implementing security; Upgrading/updating the database / Maintenance.

The database administrator at the school would be responsible for managing who gets access to the system by creating and approving user accounts for parents and staff members. They would also be in charge of backup and recovery, ensuring that if the server crashes, all student transport records and payment histories can be restored without losing critical information.

(d)
For the correct answer (two ways, each with outline):
Access controls — effective restrictions so end users can access only the data or programs for which they have legitimate privileges. User accounts — different users have usernames and passwords/biometrics to enable unique login experiences. Data Encryption — nullifies the potential value of data interception and ensures the confidentiality of data.

One strong method is implementing strict access controls so that a parent logging in can only view their own child’s records and never another student’s data. Another essential measure is encrypting all stored personal data, which means even if someone gained unauthorized access to the raw database files, the information would appear as indecipherable gibberish without the correct decryption keys.

(e)
For the correct answer (up to 6 marks):
Privacy considerations must be ensured — prevention of unauthorized access to private data, providing security measures e.g. logins; Encryption of data to ensure it is unusable to unauthorized users; Especially as many of the data subjects will be under 18; Ensuring that the inappropriate use of data cannot take place, for example sharing the data with third parties without the consent of data subjects; Measures to ensure accuracy and completeness when collecting data; Availability of data content, and the data subject’s legal right to access; Ownership rights to inspect, update or correct these data; The data is not available to the vast majority of its users / Views are used to access specific data instead of giving access to the whole table / Redaction; The database is designed to conform with data protection legislation; Data must not be kept for longer than it is required; Only relevant data must be collected and stored; Keeping the data secure from loss or damage e.g. backup.

Designing ethically means putting the students’ rights at the centre of every technical decision. The developers must ensure the database only collects information that is genuinely necessary (data minimisation), stores it securely with encryption and strict access controls, and gives parents the ability to review and correct their children’s records. Since many students are minors, extra care must be taken, and the system should automatically purge records after a legally defined retention period rather than holding onto them indefinitely.

Question 3

The ATHLETICS table contains information about athletics events.

(a) Outline one reason why databases are normalized.
(b) Outline why the data type for the Olympic Record attribute (OlymRec) cannot be an integer.
The table can also be represented as:
ATHLETICS
(Event, Type, SubType, Gender, OlymRec, WldRec)
(c) Construct the 2nd Normal Form (2NF) of the unnormalized ATHLETICS relation shown above.
(d) Outline why databases are normalized from 2nd normal form (2NF) to 3rd normal form (3NF).

Most-appropriate topic codes (IB Computer Science SL):

• Topic A.2 — The relational database model (Parts (a), (b), (c), (d))

Answer/Explanation

(a)
For the correct answer (any one, outlined):
Reduces duplicated/Redundant data — which reduces wastage of storage space / reduces/eliminates data inconsistencies and improves query processing time. Improves data security/privacy — as more granular access control can be implemented on individual tables. Improves data integrity/consistency — as integrity constraints can be set to ensure changes follow allowed rules, eliminating update/insert/delete anomalies. Improves query performance / makes querying data easier — as data is stored in a structured manner.

One compelling reason to normalize is to eliminate update anomalies. Imagine if the Type “Track” was misspelled in one row — without normalization, you would have to hunt through every occurrence and fix it individually, risking inconsistency. By separating the event description into its own table, you store each piece of information exactly once, so a correction needs to be made in only one place.

(b)
For the correct answer:
The value in OlymRec has a decimal point / is not a whole number — so it would not be accurate; a float/double/real data type is required. The timing of athletes will vary in just milliseconds and so the minute will barely change, so a decimal point is required for the accurate measurement.

An Olympic record like $9.63$ seconds for the $100\text{m}$ sprint is a fractional value where the hundredths of a second are critically important. If you stored this as an integer, the record would be truncated to $9$ (or rounded to $10$), completely destroying the precision needed to distinguish between a gold-medal-winning time and a runner-up finish.
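
A two-line illustration of the loss of precision, using the 9.63 value from the explanation above:

```python
record = 9.63  # seconds, the 100 m example used above

truncated = int(record)  # drops the fractional part
rounded = round(record)  # nearest whole number
print(truncated, rounded)
```

Either way the hundredths of a second are gone, so two different times such as 9.63 and 9.80 would be stored as the same integer.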

(c)
For the correct answer:
EVENTS (Event, Type, Subtype) and ATHLETICS (Event*, Gender, OlymRec, WldRec); OR EVENTS (EventID, Event, Type, Subtype) and ATHLETICS (EventID*, Gender, OlymRec, WldRec).

In 2NF, you identify that Type and Subtype depend only on the Event (partial dependency) and not on the combination of Event and Gender. So you split the table in two: an EVENTS table storing each event’s Type and Subtype once, keyed by Event (or an EventID surrogate), and an ATHLETICS table that keeps the Event (as a foreign key), Gender, OlymRec, and WldRec — eliminating the repetition of “Track / Run” for every row of the same event.
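
The split can be sketched in Python with two dictionaries standing in for the two tables. The ATHLETICS figure is not reproduced here, so the sample rows are assumptions, and the record values are placeholders, not real records.

```python
# Unnormalized rows: (Event, Type, SubType, Gender, OlymRec, WldRec).
# Placeholder values only; they are not taken from the exam figure.
rows = [
    ("100m", "Track", "Run", "M", 1.0, 1.1),
    ("100m", "Track", "Run", "F", 2.0, 2.1),
    ("Long jump", "Field", "Jump", "M", 3.0, 3.1),
    ("Long jump", "Field", "Jump", "F", 4.0, 4.1),
]

# EVENTS: Type and SubType depend on Event alone, so store them once per event.
events = {event: (type_, subtype) for event, type_, subtype, *_ in rows}

# ATHLETICS: the records depend on the full key (Event, Gender).
athletics = {
    (event, gender): (olym, wld)
    for event, _, _, gender, olym, wld in rows
}

print(events)
print(athletics)
```

Four repeated Type/SubType pairs collapse into two EVENTS entries, while ATHLETICS keeps one entry per (Event, Gender) pair.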

(d)
For the correct answer:
To remove transitive (functional)/non-key dependencies — where non-key attribute(s) depend on another non-key attribute; to reduce insert/update/delete anomalies and further reduce data redundancy.

Moving from 2NF to 3NF means hunting down transitive dependencies — situations where a non-key column depends on another non-key column rather than directly on the primary key. For instance, if SubType determined Type, that would be a transitive dependency. Eliminating these ensures you cannot end up with conflicting data (like two subtypes mapping to different types) and further slims down redundancy.

Question 4

A company designs new kitchens for customers. It has a shop that shows examples of the kitchen cabinets, sinks, wall tiles and floor tiles that can be included in the new kitchen.
When customers have chosen the items they would like for the new kitchen, a simulation is set up to show how these items would look.
(a) State three variables that could be used for this simulation.
(b) Outline two rules that would need to be applied for this simulation to be created within the constraints of the customer’s kitchen.
(c) Outline two factors that would impact on the reliability of this simulation.
(d) Discuss the advantages and disadvantages of using simulation to design a fitted kitchen.

Most-appropriate topic codes (IB Computer Science SL):

• Topic B.1 — The basic model (Parts (a), (b), (c))
• Topic B.2 — Simulations (Part (d))

Answer/Explanation

(a)
For the correct answer (any three):
NUMBER_OF_WALLS, WALL1_HEIGHT, WALL1_WIDTH, TILE_HEIGHT, TILE_WIDTH, etc. (accept ‘dimensions’ but not vague ‘size’).

To build a meaningful kitchen simulation, the software needs measurable inputs it can mathematically manipulate. Variables like the width and height of each wall let the system calculate how many tiles will fit, while the dimensions of individual cabinets determine whether they can be placed side by side along a given run of wall without overlapping or leaving awkward gaps.

(b)
For the correct answer (two rules, each outlined):
The cabinet widths when added together must be less than the wall width — in order for the cabinets to fit the available space. The width of the room and cabinet depths must be taken into account when fitting cabinets to opposite walls — to make sure there is enough room for a person to comfortably walk/work between them. (Accept other reasonable answers.)

One essential rule is a fit constraint: the sum of all cabinet widths placed along a wall must be less than or equal to that wall’s total length, otherwise the simulation would allow an impossible design. Another critical rule concerns human ergonomics: if cabinets are placed on two opposing walls, the gap between them must exceed a minimum threshold (say $1.2\text{m}$) so that a person can actually stand at the counter, open drawers, and move around without feeling cramped.
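
Both rules reduce to simple comparisons, sketched below. The function names and the sample dimensions are assumptions; lengths are held in whole millimetres to avoid floating-point rounding, and the 1200 mm aisle is the example threshold from the explanation, not an official standard.

```python
MIN_AISLE_MM = 1200  # example threshold (1.2 m) from the explanation above


def cabinets_fit(wall_width_mm, cabinet_widths_mm):
    """Rule 1: the run of cabinets must not be wider than the wall."""
    return sum(cabinet_widths_mm) <= wall_width_mm


def aisle_is_wide_enough(room_width_mm, depth_a_mm, depth_b_mm,
                         min_aisle_mm=MIN_AISLE_MM):
    """Rule 2: cabinets on opposite walls must leave a walkable gap."""
    return room_width_mm - depth_a_mm - depth_b_mm > min_aisle_mm


print(cabinets_fit(3200, [600, 600, 800]))   # 2000 mm of cabinets on a 3200 mm wall
print(aisle_is_wide_enough(2200, 600, 600))  # only 1000 mm left between the runs
```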

(c)
For the correct answer (two factors, each outlined):
The accuracy of the measurements/shape of the room — if the room was not measured correctly, the resulting simulation will not be correct. The dimensions of the cabinets/tiles used in the simulation — the cabinets/tiles may prove to be too big for the available space. The quality of the representation of colours/designs/patterns of the products — may lead to customer disappointment. If it’s not a VR simulation — it may be difficult to visualize how much space there is around a person when the cabinets are fitted. (Do not accept hardware limitations.)

Reliability hinges on input fidelity and perceptual accuracy. If the surveyor’s laser measure was slightly off and recorded a wall as $3.2\text{m}$ when it is actually $3.15\text{m}$, a cabinet run that appeared to fit perfectly on screen might be physically impossible to install. Beyond measurements, the simulation’s colour rendering matters hugely — a tile that looks like a warm cream on a calibrated monitor might arrive as a cold beige in reality, undermining the entire design consultation.

(d)
For the correct answer (2 advantages, 2 disadvantages, 1 conclusion):
Advantages — Allows the customer to try out different designs until it meets their requirements; The simulation is a tool that may help to avoid making expensive mistakes / may help to ensure better customer satisfaction. Disadvantages — The simulation will take time to set up and it may not be timely enough to satisfy the customer; The simulation depends on measurements taken and it may not be accurate enough / the kitchen may not fit in the actual space; The usefulness depends on the hardware — it may take too much time to render. Conclusion — A balanced final judgement on whether simulation is worthwhile.

Simulation gives customers an extraordinary preview — they can swap cabinet colours, rearrange the layout, and instantly see the result without a single physical prototype being built, potentially saving thousands in costly mid-installation changes. However, the simulation is only as trustworthy as the measurements fed into it, and a low-resolution render on a slow computer might look nothing like the final $3\text{D}$ reality, risking disappointment. On balance, the benefits of visual experimentation far outweigh the drawbacks, provided the designer validates every critical dimension with a physical site survey.

Question 5

A real estate agent makes use of electronic brochures to send to potential house buyers. These brochures contain details of the properties, including sets of photographs of the rooms and the different views from the property.
(a) Outline the impact in terms of memory requirements on the potential house buyer’s device when viewing a brochure.
The real estate agent decides to improve their brochures by using animated ‘walk-throughs’.
(b) State the name of the process that relates the original photographs of the properties to the animated ‘walk-throughs’.
(c) Explain how ray tracing may be beneficial to the production of the real estate agent’s animations.
(d) Explain the ethical considerations for the use of animated ‘walk-throughs’ in the new brochures.

Most-appropriate topic codes (IB Computer Science SL):

• Topic B.3 — Visualization (Parts (a), (b), (c))
• Topic B.2 — Simulations (Part (d))

Answer/Explanation

(a)
For the correct answer:
The images provided may be high resolution; the device will therefore need to have sufficient amount of RAM so that images can load quickly / correctly. Standard memory requirements should be enough since only text and images are being displayed.

Electronic brochures packed with high-resolution property photographs can consume substantial memory when opened. Each uncompressed image might occupy several megabytes in RAM during viewing, so a device with limited memory could struggle, causing slow scrolling, lagging transitions, or even crashes if too many large images are rendered simultaneously on screen.
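
A rough estimate of the memory needed once an image is decoded, assuming 3 bytes per pixel (24-bit colour) and ignoring any overhead added by the viewer application. The resolution and the photo count are hypothetical.

```python
def uncompressed_bytes(width_px, height_px, bytes_per_pixel=3):
    """Memory for one decoded image: bytes per pixel times the pixel count."""
    return width_px * height_px * bytes_per_pixel


photo = uncompressed_bytes(1920, 1080)  # hypothetical Full HD photograph
brochure = 20 * photo                   # hypothetical brochure with 20 photos
print(photo / 1_000_000, brochure / 1_000_000)  # sizes in MB (10^6 bytes)
```

Under these assumptions a single photo needs about 6.2 MB once decoded, and holding all 20 in memory at once needs roughly 124 MB, which is why viewers typically decode only the images currently on screen.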

(b)
For the correct answer:
(3D) Rendering

The process that transforms a set of static photographs into a seamless animated walk-through is called rendering. It involves the computer calculating how each scene should appear from successive viewpoints, generating the intermediate frames that create the illusion of smoothly moving through the property.

(c)
For the correct answer:
Ray tracing renders the image by computing the path of light between the objects in the room and different viewpoints; allows the objects in the room to be simulated and shown as they would look from different positions and angles; allowing a more realistic experience while operating the virtual walk-through.

Ray tracing works by simulating the actual physics of light — tracing rays from a virtual camera through each pixel, bouncing them off surfaces, and calculating how they interact with materials. This means reflections on a polished wooden floor, soft shadows cast by furniture near a window, and the subtle way light diffuses through curtains can all be rendered with near-photographic accuracy, making the walk-through feel genuinely immersive rather than like a cartoon approximation.

(d)
For the correct answer (two considerations, each explained):
Privacy — the animation may accidentally capture images that the homeowner would rather were not made public. Security — the animation may show security flaws so that a potential burglar may enter the property. Reliability — the animation may make the property seem better than it is, so that the potential buyer may feel cheated. Anonymity — items in the animation may identify the seller of the property, which could be illegal / have negative impact on the seller.

One pressing ethical concern is privacy: a 360-degree walk-through might inadvertently reveal personal items like family photographs on the wall or sensitive documents on a desk, exposing details about the current occupants that they never consented to share publicly. Another is representational honesty — if the animation digitally enhances room sizes, hides structural defects, or uses unrealistic lighting to make a dark basement look bright and airy, the buyer is being deceived, which breaches ethical standards of fair dealing in real estate marketing.

Question 6

A supermarket has set up a spreadsheet model to compare its sales for each quarter during the financial year 2020 to 2021.
This model, for each of the eight departments, shows the:
  • quantity of units sold each quarter
  • average units sold per quarter
  • highest quarterly sales
  • lowest quarterly sales.
The manager of the supermarket plans to use this model in meetings with the eight department heads so that they can set targets for future sales.

(a) Identify the functions or formulas that could be used in the cells:

(i) F3 
(ii) G3 
(iii) H3 
(iv) I3
This model needs to be developed to set targets for increasing the sales over the next financial year for the bakery department. The target percentage increase can be changed within the model.
(b) Design a spreadsheet model that will calculate the target sales for the bakery department.
The model will display the updated sales targets for each quarter, the whole year and the average per quarter. The initial sales target is an increase of $7\%$.
(c) Describe one limitation of this model for predicting future profits.
The supermarket uses a second model to predict future sales increases based on previous performance. The spreadsheet in Figure 5 is part of that model. For the year 2020 to 2021, it shows the:
  • revenue for sales taken by each department in each quarter
  • cost of purchasing the stock for the supermarket
  • utility costs of running the store
  • staff costs.
All values have been rounded to the nearest dollar.
(d) Identify the formulas used in the cells:
(i) B30
(ii) B32
The names of the departments have been stored in a one-dimensional array, DEPARTMENT[]. It has been decided to use a number of parallel one-dimensional arrays to store the quarterly figures and the annual totals for each department.
(e) Construct the pseudocode required to enter the data for each department for each separate quarter, calculate the annual totals and store the data into suitably named arrays. [6]

Most-appropriate topic codes (IB Computer Science SL):

• Topic B.1 — The basic model (Parts (a), (b), (c), (d))
• Topic B.2 — Simulations (Part (e))

Answer/Explanation

(a)(i)
For the correct answer:
=SUM(B3:E3) // =B3+C3+D3+E3

Cell F3 sits in the “Whole year” column and needs to total all four quarterly sales figures for the Bakery row. The simplest approach is to use the SUM function across the range B3 to E3, which adds up the four quarterly values to give the annual total of $44.4$.

(a)(ii)
For the correct answer:
=AVERAGE(B3:E3) // =(B3+C3+D3+E3)/4 // =F3/4

The quarter average in G3 represents the mean sales per quarter. You can either directly average the four quarterly cells, or more elegantly divide the already-calculated whole-year figure in F3 by $4$, which yields $44.4 / 4 = 11.1$.

(a)(iii)
For the correct answer:
=MAX(B3:E3)

To identify the highest-performing quarter, the MAX function scans through cells B3 to E3 and returns the largest value. For Bakery, this would select $14.7$ from the Jan-Mar column.

(a)(iv)
For the correct answer:
=MIN(B3:E3)

Similarly, the MIN function picks out the smallest quarterly value, which for Bakery is $9.4$ from both the Jul-Sep and Apr-Jun quarters.

(b)
For the correct answer (5 marks):
Inclusion of $7$ or $1.07$ anywhere in the answer (either in a cell or in formula); Use of absolute cell reference for common target $\%$ in cell B16; Use correct formula in cell B16, either $=B3 * 1.07$ or $=B3 * (1 + \$C\$12 / 100)$; Use correctly adapted formulas in cells C16, D16, E16; Use correct formulas in F16 and G16, either similar to columns B-E or just directly calculating the sum and average on B16..E16.

The model should place the target percentage ($7\%$) in a separate cell and refer to it with an absolute reference, so it can be changed later without rewriting every formula. Then each quarter’s target is calculated by multiplying the original sales figure by $1.07$ (or equivalently by $(1 + \text{target\%}/100)$). The target whole-year and target average can then be computed from these four new target quarterly values using SUM and AVERAGE, mirroring the structure of the original data.
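
The same calculation can be sketched outside a spreadsheet. `TARGET_PERCENT` plays the role of the single absolute-referenced cell; the function name and the quarterly figures are hypothetical and are not the values from the exam's figure.

```python
TARGET_PERCENT = 7  # the one changeable input, like the absolute-referenced cell


def target_row(quarterly_sales, percent=TARGET_PERCENT):
    """Return the target quarters, whole-year total and quarterly average."""
    targets = [q * (1 + percent / 100) for q in quarterly_sales]
    total = sum(targets)
    return targets, total, total / len(targets)


# Hypothetical quarterly figures for illustration only.
targets, total, average = target_row([100, 200, 300, 400])
print([round(t, 2) for t in targets], round(total, 2), round(average, 2))
```

Changing `TARGET_PERCENT` (or passing a different `percent`) updates every derived figure, which is the point of keeping the percentage in one place.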

(c)
For the correct answer (any one):
The model only shows data for one year — it may not be an accurate basis for predicting next year’s outcome. The model assumes there is the same target $\%$ for each department — this may not be an accurate reflection of the way the business is developing. The future sales cannot be predicted based on a desired $\%$ increase — for a prediction you need trends over time.

One fundamental weakness is that the model extrapolates from a single year’s data. Sales figures for 2020–2021 might have been unusually high or low due to one-off events (like a pandemic lockdown boosting grocery sales or a supply chain disruption depressing deli figures). Using just one data point to forecast the future ignores seasonal trends, economic cycles, and competitor activity, making the $7\%$ target potentially unrealistic or, conversely, far too conservative.

(d)(i)
For the correct answer:
=B24+B28 // =B24+B26+B27 // =SUM(B24, B28)

Cell B30 sits in the “Total costs” row and must combine the wholesale costs subtotal (row 24) with the other costs subtotal (row 28). A straightforward addition formula $=B24+B28$ achieves this correctly.

(d)(ii)
For the correct answer:
=B13-B30 // =B13-B24-B28 // =B13-B24-B26-B27

Cell B32 represents the profit, which is defined as total revenues minus total costs. Looking at the spreadsheet layout, row 13 holds the total revenues and row 30 holds the total costs, so the formula $=B13-B30$ computes the difference, yielding the profit figure of $22.2$ for the Apr-Jun quarter.

(e)
For the correct answer (6 marks):
Use of a loop with correct parameters; Prompts to let user know for which department to enter data/which quarter; Appropriately named array for at least one quarter; All arrays appropriately named; Correct formula to add together annual data; All input/calculated data assigned to correct array elements.

loop COUNT from 0 to 7
    output "Enter data for ", DEPARTMENT[COUNT], " Department"
    output "Apr - Jun: "
    input APR_JUN[COUNT]
    output "Jul - Sep: "
    input JUL_SEP[COUNT]
    output "Oct - Dec: "
    input OCT_DEC[COUNT]
    output "Jan - Mar: "
    input JAN_MAR[COUNT]
    WHOLEYEAR[COUNT] = APR_JUN[COUNT] + JUL_SEP[COUNT] + OCT_DEC[COUNT] + JAN_MAR[COUNT]
end loop

The pseudocode uses a loop that iterates through each of the eight departments (index 0 to 7). For each iteration, it prompts the user with the department name, collects four quarterly sales values into parallel arrays, and then computes the annual total by summing those four inputs and storing the result in a separate WHOLEYEAR array at the same index position.
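
For comparison, here is a rough Python equivalent of the pseudocode above. It is a sketch, not part of the mark scheme: the function name `enter_sales` and the `read` parameter (which lets the input source be swapped out for testing) are assumptions.

```python
def enter_sales(departments, read=input):
    """Fill four parallel quarter lists plus a parallel annual-total list."""
    apr_jun, jul_sep, oct_dec, jan_mar, whole_year = [], [], [], [], []
    for name in departments:
        print(f"Enter data for {name} Department")
        apr_jun.append(float(read("Apr - Jun: ")))
        jul_sep.append(float(read("Jul - Sep: ")))
        oct_dec.append(float(read("Oct - Dec: ")))
        jan_mar.append(float(read("Jan - Mar: ")))
        # The annual total sits at the same index as the four quarterly values.
        whole_year.append(apr_jun[-1] + jul_sep[-1] + oct_dec[-1] + jan_mar[-1])
    return apr_jun, jul_sep, oct_dec, jan_mar, whole_year
```

Calling `enter_sales(DEPARTMENT)` with the eight department names would prompt for 32 values in total; passing a different `read` function allows the routine to be exercised without typing.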

Question 7

The web browser shown in Figure 6 includes a feature that enables the user to inspect the source code.
(a) Outline why the URL in Figure 6 is a “Full URL”. 
(b) Sketch the output of the code in Figure 6. 
(c) Outline why the web page in Figure 6 is a static web page. 
The web page in Figure 6 uses Javascript elements.
(d) Explain why the support of client-side scripting languages is a key function of web browsers. 
(e) Distinguish between a protocol and a standard.
A user wants to access another website and enters its URL into the address bar. 
(f) Describe how the domain name service (DNS) enables the user to access the new site.
A user wishes to download a video resource from a web-based host to their smartphone. The site offers a lossy download option and lossless download option. It was recommended that the user uses the lossy compression option for this download.
(g) Explain why lossy compression is used in mobile computing. 

Most-appropriate topic codes (IB Computer Science SL):

• Topic C.1 — Creating the web (Parts (a), (b), (c), (d), (e), (f))
• Topic C.3 — Distributed approaches to the web (Part (g))

Answer/Explanation

(a)
For the correct answer:
The URL contains the protocol/scheme, domain, sub domain and Top-level domain (path) and the file resource (Query/fragment). For reference: Scheme/Protocol: https://, Subdomain: www, Domain Name: educationalsite.org, Top level domain: org, Path: /assets/home.html

A full URL leaves no ambiguity about how to reach the resource. It explicitly states the protocol (https://), the subdomain (www), the registered domain (educationalsite.org), the top-level domain (.org), and the exact file path (/assets/home.html) — every component the browser needs to locate and retrieve that specific page is present in the address.
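
The components can be picked apart with Python's standard `urllib.parse` module. The URL below is assembled from the parts listed in the mark scheme; splitting the hostname on dots is a simplifying assumption that works for this address but not for every domain (for example, multi-part suffixes such as `.co.uk`).

```python
from urllib.parse import urlsplit

parts = urlsplit("https://www.educationalsite.org/assets/home.html")
subdomain, *domain_labels = parts.hostname.split(".")

print(parts.scheme)             # https
print(subdomain)                # www
print(".".join(domain_labels))  # educationalsite.org
print(domain_labels[-1])        # org
print(parts.path)               # /assets/home.html
```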

(b)
For the correct answer:

The HTML renders a heading reading “Mother Tongue Languages” followed by a table with a single header row listing eight language columns. However, the table body contains no actual data rows, so below the header the table appears empty — just the column titles sitting above blank space, which would look incomplete to a viewer.

(c)
For the correct answer:
The web page does not display different content each time it is viewed; it doesn’t change dependent on either user input, time of day etc.; A static web page requires the source code to be rewritten/edited/modified to add new content; A static page is one where the construction of the page is controlled by the browser on the client device and does not link to an external data source; whereas a dynamic web page is linked to an external data source.

A static page is essentially a fixed document sitting on a server. Every visitor who navigates to that URL sees exactly the same HTML file — the content doesn’t adapt based on who is logged in, what time it is, or any database query. To change anything on the page, someone must physically edit the HTML source code and re-upload it, unlike dynamic pages that assemble content on the fly from databases or APIs.

(d)
For the correct answer:
Client-side scripting languages include JavaScript (and libraries such as jQuery); These are commonly used languages that add functionality and interaction to pages; They are broadly implemented and accepted as a standard; Client-side scripts are run in the browser rather than on the server; Failure to support them correctly may reduce functionality or cause errors in appearance or interaction; Broad support of client-side scripting languages allows similar functionality and therefore a similar browsing experience independent of browser type.

Modern web pages are not just static documents — they are interactive applications. Client-side scripting (primarily JavaScript) enables everything from form validation to animated menus to real-time content updates without reloading the page. Browsers must interpret and execute these scripts locally on the user’s device; without this capability, the vast majority of today’s websites would break, becoming inert, unresponsive shells of their intended selves.

(e)
For the correct answer:
A protocol is a set of rules which enable network communication that must be followed; A standard is a set of rules which have broad support and should be adhered to and provide a framework for development.

A protocol is a precise agreement on how data is formatted and transmitted — like HTTP defining that a request must start with a method (GET, POST) followed by headers and a body. A standard is a wider, often industry-backed specification that multiple protocols and technologies conform to — HTML5 is a standard because all major browser vendors have agreed to implement it, ensuring web pages render consistently regardless of which browser you use.

(f)
For the correct answer:
The DNS server or Domain Name Service server translates domain names into IP addresses; The browser first checks its own cache to see if it has a recent DNS record for the domain. If found, it uses this information to connect directly to the website’s IP address. If the required DNS record is not found locally, the DNS query is sent to the configured DNS resolver, which passes it up the hierarchy as far as the top-level domain (TLD) servers. Top-level domain servers are the ultimate authority for the domain and hold the master list of sites for the domain. Address resolution occurs in the application layer of TCP/IP. The DNS resolver sends the IP address back to the user’s web browser and the information is displayed. If it is not found, an error message is sent back to the client’s browser.

DNS acts like the internet’s phonebook. When you type a human-readable address like “example.com” into your browser, the DNS system works through a hierarchy: it first checks the local cache, then queries the configured resolver (often your ISP’s), which may in turn ask a root server, the .com top-level domain server and finally the domain’s authoritative name server. The resolver then returns the numerical IP address (like 93.184.216.34) that your computer needs to connect to the correct server and load the website.
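The cache-then-resolver order can be sketched in Java as a toy simulation. Nothing here touches the network: the class name DnsSketch, the two maps and the single sample record are assumptions made for illustration, and a real resolver would also contact root, TLD and authoritative name servers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of DNS resolution: check the local cache first, then fall back
// to a simulated resolver. All names and records here are hypothetical.
class DnsSketch {
    private final Map<String, String> localCache = new HashMap<>();
    private final Map<String, String> resolverRecords = new HashMap<>();

    DnsSketch() {
        // Pretend the resolver already holds this record (sample value only).
        resolverRecords.put("example.com", "93.184.216.34");
    }

    // Returns the IP address for a domain, or empty if no record exists.
    Optional<String> resolve(String domain) {
        String cached = localCache.get(domain);
        if (cached != null) {
            return Optional.of(cached);      // cache hit: no query needed
        }
        String answer = resolverRecords.get(domain);
        if (answer == null) {
            return Optional.empty();         // the browser would show an error
        }
        localCache.put(domain, answer);      // remember the answer for next time
        return Optional.of(answer);
    }

    boolean isCached(String domain) {
        return localCache.containsKey(domain);
    }
}
```

The first call for a domain falls through to the resolver; a repeated call is answered from the cache, which is why a second visit to a site usually connects faster.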

(g)
For the correct answer (up to 4 marks, from clusters):
File Size — Lossy compression allows the file size to be reduced more than lossless compression; this means it can be used to transfer information between devices more rapidly. Quality — Although there will be some reduction in picture quality or sound, the advantages of greater speed of transfer will outweigh the loss of image quality as the screen size of the mobile device may not be sufficiently large for the imperfections to be apparent. Cost — Lossy compression resulting in smaller file size would require less data usage, conserving bandwidth and reducing costs for mobile users.

On a mobile device, every megabyte counts. Lossy compression aggressively shrinks video files by discarding perceptual details the human eye barely notices — slight colour variations in shadow areas or audio frequencies outside normal hearing range. The result might be a $50\text{MB}$ video becoming $10\text{MB}$, downloading five times faster over a cellular connection, consuming a fraction of the user’s monthly data allowance, and still looking perfectly acceptable on a palm-sized smartphone screen where minor compression artefacts are invisible.
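As a rough worked example of the saving, assume a steady 8 Mbit/s cellular link (a figure chosen purely for illustration) and ignore protocol overhead:

$$t_{50} = \frac{50\ \text{MB} \times 8\ \text{bit/byte}}{8\ \text{Mbit/s}} = 50\ \text{s}, \qquad t_{10} = \frac{10\ \text{MB} \times 8\ \text{bit/byte}}{8\ \text{Mbit/s}} = 10\ \text{s}$$

The ratio $t_{50} / t_{10} = 5$ holds whatever the link speed, since transfer time is proportional to file size.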

Question 8

While working on an assignment task for History of the Americas, Brooke enters a question into a search engine (Figure 7).
The search returns 173000 results in 0.043 seconds.
Another student indicated that Brooke would obtain better results using keywords rather than a search phrase.
(a) Outline why keywords would be used in a search rather than a phrase.
Web crawling indexes webpages in the search engine’s database (Figure 8). The two web crawling methods used are a breadth-first crawl and a depth-first crawl.
(b) In Figure 8, A has not been previously visited. State the first three webpages visited in a breadth-first search.
(c) Outline one reason why search engines use a breadth-first search.
As the web crawler traverses the pages in a website it collects data. This is used to form the metrics data for search rankings.
(d) Identify two features of the PageRank algorithm.
Many web developers attempt to optimize the search results for their site.
(e) Explain the impact for DP History students such as Brooke if the web developer uses black hat search engine optimization techniques.

Most-appropriate topic codes (IB Computer Science SL):

• Topic C.2 — Searching the web (Parts (a), (b), (c), (d), (e))

▶️ Answer/Explanation

(a)
For the correct answer:
Keywords are often matched with the metatag keywords and description as well as the page content increasing the accuracy of the search; Use of keywords helps to clarify the searcher’s thinking; Using a phrase adds additional terms that are either ignored by the search engine or could provide false positive results; Can reduce contextual differences derived from use of language in the results.

When Brooke types a full natural-language question like “What were the causes of the American Civil War?”, the search engine has to parse irrelevant filler words (“what”, “were”, “the”, “of”) that dilute the query’s focus. By reducing the query to keywords — “causes American Civil War” — every term carries analytical weight, matching more precisely against indexed page content and metadata, producing results that are more relevant and less cluttered by pages that happen to contain the phrase “what were the” in some unrelated context.
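The reduction from phrase to keywords can be sketched in Java. This is only an illustration: the class name KeywordSketch and the four-word filler list are assumptions sized to the example query, and real search engines handle filler words in far more sophisticated ways.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeywordSketch {
    // Hypothetical filler-word list, just large enough for the example query.
    private static final Set<String> FILLER =
            new HashSet<>(Arrays.asList("what", "were", "the", "of"));

    // Splits the phrase into words and keeps only those not in FILLER.
    static String toKeywords(String phrase) {
        List<String> kept = new ArrayList<>();
        for (String word : phrase.split("\\W+")) {
            if (!word.isEmpty() && !FILLER.contains(word.toLowerCase())) {
                kept.add(word);
            }
        }
        return String.join(" ", kept);
    }
}
```

Calling KeywordSketch.toKeywords("What were the causes of the American Civil War?") yields "causes American Civil War".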

(b)
For the correct answer:
A-B-E

In a breadth-first crawl starting from node A, the crawler visits A first, then every page that A links to directly, in the order the links are discovered, before following any link found on those pages. B and E are the first pages reached from A in this way, so the sequence is A, then B, then E: the shallowest layer is covered before the crawl descends.
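A queue-based breadth-first traversal can be sketched in Java as follows. The link graph in sampleLinks() is hypothetical and is not a copy of Figure 8; it only assumes that A links to B and E, as the answer implies. The class and method names are likewise illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class BreadthFirstCrawl {
    // Returns pages in the order a breadth-first crawl would visit them.
    static List<String> crawl(Map<String, List<String>> links, String start) {
        List<String> visitOrder = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        seen.add(start);
        while (!frontier.isEmpty()) {
            String page = frontier.remove();   // oldest discovered page first
            visitOrder.add(page);
            for (String next : links.getOrDefault(page, List.of())) {
                if (seen.add(next)) {          // add() is false if already seen
                    frontier.add(next);
                }
            }
        }
        return visitOrder;
    }

    // Hypothetical link graph for illustration only.
    static Map<String, List<String>> sampleLinks() {
        Map<String, List<String>> links = new LinkedHashMap<>();
        links.put("A", List.of("B", "E"));
        links.put("B", List.of("C", "D"));
        links.put("E", List.of("F"));
        return links;
    }
}
```

On this sample graph the crawl order is A, B, E, C, D, F: both of A's direct links are visited before any page one level deeper. Replacing the queue with a stack would turn the same loop into a depth-first crawl.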

(c)
For the correct answer (any one):
The breadth first crawl ensures that all pages at a certain depth are indexed before moving deeper, giving a thorough snapshot of the web. The breadth first crawl completes/visits all the links on the page before moving to the next page, whereas the depth first crawl follows the line of one series of links from page to page until it ends — potentially this means the initial pages may not be completely indexed if the stack is large. Prioritizes more important, higher-level pages, which are more likely to contain significant information and links to other relevant content.

Breadth-first crawling is like systematically mapping a city by first cataloguing every building on Main Street before venturing down any side alleys. This ensures the crawler builds a comprehensive index of the most prominent, highly-linked pages early in the crawl, avoiding the risk of getting lost down an infinitely deep chain of obscure, low-value pages and missing the important hub content altogether.

(d)
For the correct answer (any two):
Page rank is a numerical value that represents the importance of a page; A damping factor is applied to the calculation; Page rank algorithm is recursive; Page rank uses the page rank of incoming / outgoing links.

PageRank fundamentally treats links as votes of confidence. One key feature is its recursive nature: a page’s rank depends on the ranks of the pages linking to it, which in turn depend on their own incoming links, requiring iterative computation until values converge. Another feature is the damping factor (typically around $0.85$), which models the probability that a random surfer will continue clicking links rather than jumping to an entirely new page, preventing rank from being trapped in circular link loops.
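The iterative calculation can be sketched in Java using the commonly quoted simplified formula, in which each page receives (1 - d)/N plus d times the rank shared out by the pages linking to it. This is a minimal sketch, not a search engine's actual implementation: the class name is hypothetical and the code assumes every page has at least one outgoing link.

```java
import java.util.Arrays;

class PageRankSketch {
    // links[i] lists the indices of the pages that page i links to.
    // Assumption: every page has at least one outgoing link.
    static double[] rank(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);                  // start with equal ranks
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n);  // "random jump" share
            for (int page = 0; page < n; page++) {
                for (int target : links[page]) {
                    // A page splits its rank equally among its outgoing links.
                    next[target] += damping * rank[page] / links[page].length;
                }
            }
            rank = next;                             // feed results back in
        }
        return rank;
    }
}
```

For a three-page example where page 0 links to pages 1 and 2, page 1 links to page 2, and page 2 links to page 0, calling rank(new int[][] {{1, 2}, {2}, {0}}, 0.85, 100) gives page 2, which has two incoming links, the highest rank, and the ranks sum to 1.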

(e)
For the correct answer (up to 4 marks, with reference to DP History):
Information inaccuracy — As the Black hat SEO distorts search rankings (using keyword stuffing), it provides false/inaccurate information; the students may potentially have unintended access to inappropriate material; the validity of material is questionable using unethical techniques. Time and Resources — This may lead to wastage of bandwidth visiting unintended websites; the time wasted in visiting the site. Security — including exposure to malware and other security issues; potential damage from this.

For Brooke researching History of the Americas, black hat SEO poses a serious academic threat. A site using keyword stuffing might invisibly pack its pages with terms like “American Revolution primary sources,” tricking the search engine into ranking it highly even though its actual content is plagiarised, factually wrong, or completely unrelated — perhaps a shopping site selling revolutionary-war-themed merchandise. Brooke could waste precious research time sifting through these deceptive results, and worse, might unknowingly cite inaccurate information in her assignment, undermining her academic integrity and grade.

Question 9

ARPANET was developed as a project by the American military. It became the technical foundation for the internet. Figure 9 is a representation of ARPANET in 1974.
(a) Outline one reason why ARPANET was developed as a distributed network.
The original ARPANET used cable networks within the US. When linking to Hawaii and the United Kingdom it used a satellite link. The network consisted of connected mainframe computers hosting servers that had a number of connected terminals (clients).
(b) Outline one advantage of using a client-server architecture.
The nature of computing has evolved from client-server architecture to peer-2-peer and cloud computing.
(c) Compare peer-2-peer and cloud computing.
Decentralization of the web is partly a result of open standards, interoperability and distributed networks.
(d) To what extent have open standards and interoperability supported the decentralization of the web?

Most-appropriate topic codes (IB Computer Science SL):

• Topic C.3 — Distributed approaches to the web (Parts (a), (d))
• Topic C.4 — The evolving web (Parts (b), (c))

▶️ Answer/Explanation

(a)
For the correct answer (any one):
Connection and sharing between nodes — enabling better resource utilization / collaboration / distributed computing power / storage. Reliability of the network in the event of failure of one or more nodes (fault tolerance) — providing redundancy and replication to ensure data durability and availability. Distributed networks are more scalable, especially in scenarios where the network is distributed among large geographical areas — they leverage localized resources and reduce latency.

The military motivation was survivability. In a centralized network, destroying the single central hub would collapse all communication. ARPANET’s distributed design meant messages could dynamically reroute through any available path — if one node was knocked out (say, by a Soviet attack), traffic would simply flow around the damage through surviving nodes, keeping command and control operational.

(b)
For the correct answer (any one):
Has structured hierarchy/centralized control — the functionality of the network will be centrally controlled, giving more systematic and structured access. Enhanced security — as the server is responsible for security, it facilitates better implementation of monitoring. Enables management of resources, storage, applications and users leading to cost saving. Client devices don’t need to have as much processing power, memory, storage, etc. because the applications are kept and run on the server.

In a client-server architecture, the server acts as a gatekeeper. All critical data, authentication, and business logic reside on a central machine that can be secured, backed up, and patched by administrators. Thin client terminals — even old, low-powered machines — can access sophisticated applications because all the heavy computational lifting happens server-side, reducing hardware costs across the organisation.

(c)
For the correct answer (2 marks for similarities/contrasts, up to 4 marks):
In cloud computing resources are managed and provided by centralized data centres and accessed via the internet; it is often provided as a service (PaaS, SaaS, private cloud, community cloud, public cloud); it is generally scalable / adaptable / distributed. Peer to peer computing is distributed / decentralized; peers directly share resources such as files or processing power; level of access and control is generally set by the users; it is focused on sharing rather than supplying a specific service; there is equality between the peers in the network.

Cloud computing centralizes resources in vast, professionally managed data centres — when you use Google Drive, your files sit on Google’s servers, and you access them through a client. Peer-to-peer, in contrast, has no central authority: every participant is simultaneously a consumer and a provider, sharing their own disk space and bandwidth directly with others. Cloud offers reliability and convenience at the cost of control; P2P offers resilience and freedom from corporate oversight but depends on the goodwill of individual peers staying online.

(d)
For the correct answer (3 marks for open standards, 3 for interoperability):
Decentralization means that resources are distributed rather than localized, ensuring that diverse systems and devices can communicate seamlessly. Open standards (HTTP, TCP/IP, HTML, etc.) shift control from governments (and corporations) to a broad selection including individuals, like-minded groups and organizations; this enables freedom of expression, sharing, cooperation, communication and collaboration. Interoperability is the property of a product or system whose interfaces are understood by and work with other systems; there are no restrictions, enabling ease of processing, manipulation and transfer of data; interoperability enables information systems (particularly databases) to exchange information without significant modification or the use of third-party agents; interoperability is based on open standards.

Open standards have been the bedrock of web decentralisation. Because TCP/IP, HTTP, and HTML are publicly documented and freely implementable, no single company or government can gatekeep who builds a web server or browser. Anyone can create a website or a new web service that interoperates with everything else, which is why we have millions of independent sites rather than a single monopolistic platform. Interoperability amplifies this — your email from a tiny self-hosted server can still reach someone on Gmail because both follow the SMTP standard, meaning power remains distributed across countless independent operators rather than concentrated in a few walled gardens.

Question 10

A car rental company has offices in cities in Spain and Portugal. It manages its cars as a large, unsorted collection of rental objects that is accessed by a Java program.
The following UML diagram describes the current main Rental class. Fuel type and transmission type were chosen to be Boolean because they have two choices: petrol or diesel for fuel type, and manual or automatic for transmission type.
The brand and the model of the car are stored together as one string brandModel.
Typically the company has many cars of the same brand and model.
(a) Outline the general nature of an object.
(b) State one mutator method to be included in the class Rental.
(c) Construct the code for the accessor method getBrandModel().
(d) Outline one purpose of a default constructor.
The company is buying new electric cars and hybrid cars.
(e) Outline one change that needs to be made to class Rental due to this development (company buying new electric cars and hybrid cars).
Based on this Rental class, the program defines several other classes: Car, Bus and Van, each with their own characteristics. For example, the class Car adds the attribute numberOfDoors to the class Rental.
(f) State the relationship between Rental and Car.
(g) Construct the code for the class Car without having to duplicate all the attributes and methods from the class Rental. The default constructor of the class Rental should be overridden to also assign the value 4 to numberOfDoors. No other constructors are required.

Most-appropriate topic codes (IB Computer Science SL):

• Topic D.1 — Objects as a programming concept (Parts (a), (b), (c), (d), (e), (f), (g))
• Topic D.2 — Features of OOP (Part (g))
• Topic D.3  — Program development (Parts (b), (c), (d), (g))

▶️ Answer/Explanation

(a)
For the correct answer:
An object is an abstract entity; consists of data/attributes/properties; has methods/behaviour/actions on that data; An object occupies memory / has a lifecycle; An object is an instance of a class.

An object is a self-contained bundle that combines state and behaviour. Think of a Rental object — it holds concrete data like a specific number plate and price per day (its attributes), and it can perform actions like returning its brand model or updating its rental class (its methods). The object exists in memory as a real instance created from the blueprint defined by its class.

(b)
For the correct answer (any one):
setNumberPlate(String numberPlate); setPricePerDay(double pricePerDay); setRentalClass(char rentalClass); setYear(int year); setBrandModel(String brandModel); setFuelType(boolean fuelType); setTransmissionType(boolean transmissionType);

A mutator (or setter) method allows external code to safely modify an object’s private attribute. For example, setPricePerDay(double newPrice) would update the rental price while potentially including validation logic — ensuring the new price isn’t negative — before actually changing the internal variable.
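A mutator with validation might look like the sketch below. The class name RentalPriceSketch is a stand-in for Rental, and rejecting negative prices with an exception is one possible design choice, not something the question prescribes.

```java
// Stand-in for the Rental class, reduced to the one attribute of interest.
class RentalPriceSketch {
    private double pricePerDay;

    // Mutator: validates the new value before changing the private field.
    public void setPricePerDay(double newPrice) {
        if (newPrice < 0) {
            throw new IllegalArgumentException("pricePerDay must not be negative");
        }
        this.pricePerDay = newPrice;
    }

    // Accessor, included so the effect of the mutator can be observed.
    public double getPricePerDay() {
        return this.pricePerDay;
    }
}
```

Because the field is private, the setter is the only route by which outside code can change the price, so the check cannot be bypassed.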

(c)
For the correct answer (3 marks):
public method; return type; correct return.

public String getBrandModel() {
    return this.brandModel;
}

An accessor method grants read-only access to a private field. The method is declared public so external classes can call it, has a return type of String matching the brandModel field’s type, and simply returns the current value using the return keyword — the this keyword is optional but clarifies we’re referring to the instance variable.

(d)
For the correct answer:
A default constructor instantiates an object of a class with null or default values for the instance variables/attributes without using any parameter.

A default constructor provides a no-argument way to create an object when you don’t yet have all the specifics. When new Rental() is called, memory is allocated for the object and the constructor leaves every field at a default value (numbers at 0, booleans at false, Strings at null) unless it assigns something else, giving you a blank but valid object that can be populated later through mutator methods.
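The default values can be observed with a small Java sketch. The class DefaultRentalSketch is a hypothetical, cut-down stand-in for Rental with only three of its attributes.

```java
// Cut-down stand-in for Rental, used only to show default field values.
class DefaultRentalSketch {
    private String numberPlate;   // object references default to null
    private double pricePerDay;   // numeric fields default to 0
    private boolean fuelType;     // booleans default to false

    // Default (no-argument) constructor: assigns nothing, so every field
    // keeps the default value Java gives it.
    public DefaultRentalSketch() {
    }

    public String getNumberPlate() { return numberPlate; }
    public double getPricePerDay() { return pricePerDay; }
    public boolean getFuelType() { return fuelType; }
}
```

A freshly created object therefore reports null, 0.0 and false from its three accessors until mutator methods are called.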

(e)
For the correct answer:
fuelType can no longer be boolean; but could be another datatype such as int/char/String (or similar) to represent the distinct values for 4 different types of fuel.

With four fuel options — petrol, diesel, electric, and hybrid — a simple true/false boolean is mathematically insufficient (2 states cannot represent 4 possibilities). The fuelType attribute must be changed to a wider data type: a String could hold “electric” or “hybrid”, an int could use codes (0=petrol, 1=diesel, 2=electric, 3=hybrid), or a char could use initials (‘P’, ‘D’, ‘E’, ‘H’).
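Java also offers an enum for a fixed set of named values. The mark scheme names int, char and String "or similar", so the sketch below shows an alternative rather than the expected exam answer; the names FuelType and RentalFuelSketch are hypothetical.

```java
// One named constant per fuel option; the compiler rejects any other value.
enum FuelType { PETROL, DIESEL, ELECTRIC, HYBRID }

// Stand-in for Rental, reduced to the fuelType attribute.
class RentalFuelSketch {
    private FuelType fuelType = FuelType.PETROL;   // arbitrary starting value

    public FuelType getFuelType() {
        return fuelType;
    }

    public void setFuelType(FuelType fuelType) {
        this.fuelType = fuelType;
    }
}
```

Compared with an int code or a String, an enum cannot hold a misspelt or out-of-range value, which removes one source of invalid data.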

(f)
For the correct answer:
Car inherits Rental (allow Car ‘is a’ Rental or Car extends Rental or Car is a subclass of Rental).

The Car class extends the Rental class, establishing an inheritance (“is-a”) relationship. A Car is a specific kind of Rental — it inherits all the generic rental properties like number plate and price per day, while adding its own specialised attribute (numberOfDoors) that is unique to cars but not relevant to buses or vans.

(g)
For the correct answer (4 marks):
Award [1] for (public) class Car extends Rental; Award [1] for declaring numberOfDoors; Award [1] for numberOfDoors being set to 4 within the constructor; Award [1] for correct getter / setter method.

public class Car extends Rental {
    private int numberOfDoors;

    public Car() {
        super();      // calls Rental's default constructor
        this.numberOfDoors = 4;
    }

    public int getNumberOfDoors() {
        return this.numberOfDoors;
    }

    public void setNumberOfDoors(int n) {
        this.numberOfDoors = n;
    }
}

By using the extends keyword, Car automatically inherits all of Rental’s fields and methods without re-declaring them. The overridden default constructor first calls super() to invoke the Rental constructor (which initialises the inherited attributes), then sets numberOfDoors to 4. The getter and setter provide controlled access to this new, Car-specific attribute, following the encapsulation principle.

Question 11

(a) Identify the OOP feature that was used to declare the Car class.
(b) Explain the benefits of the feature identified in part (a).
(c) Identify the two other features of OOP.
(d) Describe one advantage of modularity in program development.

Most-appropriate topic codes (IB Computer Science SL):

• Topic D.2 — Features of OOP (Parts (a), (b), (c), (d))

▶️ Answer/Explanation

(a)
For the correct answer:
Inheritance

Declaring Car using extends Rental is the classic syntax of inheritance — the defining OOP mechanism that allows a new class to absorb the properties and behaviours of an existing class.

(b)
For the correct answer (up to 3 marks):
Because the parent class holds common attributes and methods, inheritance will enhance reuse of code and reduce maintenance costs. Faster development time — as the existing code (base class) is already tested and less code needs to be written and debugged. Child classes may add new functionality (Extensibility) — extending the parent’s action and data without redefining them. Child class redefines the base class methods (Overriding) — to provide different functionality to existing method of the parent class. Easier to maintain — as the changes in the parent class are automatically reflected in the child class.

Inheritance means you write and test the generic rental logic once in the Rental class, and every specialised vehicle type — Car, Bus, Van — inherits it for free. If a bug is found in how the price per day is calculated, fixing it in Rental automatically cascades the correction to all child classes, eliminating the nightmare of hunting down and repairing duplicated code in multiple places.

(c)
For the correct answer (any two):
Encapsulation; Polymorphism; Abstraction

Beyond inheritance, OOP is defined by encapsulation (bundling data with the methods that operate on it and restricting direct external access), and polymorphism (the ability of different object types to respond to the same method call in their own specific way — for instance, both Car and Bus might have a calculateRentalCost() method that behaves differently).
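Both features can be seen in a short Java sketch. The class names, the calculateRentalCost() method and the flat surcharge for buses are all hypothetical; the question does not define how rental cost is worked out.

```java
abstract class VehicleSketch {
    // Encapsulation: the field is private and read only through a getter.
    private final double pricePerDay;

    VehicleSketch(double pricePerDay) {
        this.pricePerDay = pricePerDay;
    }

    double getPricePerDay() {
        return pricePerDay;
    }

    // Each subclass supplies its own version of this method.
    abstract double calculateRentalCost(int days);
}

class CarSketch extends VehicleSketch {
    CarSketch(double pricePerDay) { super(pricePerDay); }

    @Override
    double calculateRentalCost(int days) {
        return getPricePerDay() * days;
    }
}

class BusSketch extends VehicleSketch {
    BusSketch(double pricePerDay) { super(pricePerDay); }

    // Hypothetical rule: buses carry a flat surcharge of 50.0.
    @Override
    double calculateRentalCost(int days) {
        return getPricePerDay() * days + 50.0;
    }
}
```

Code holding a VehicleSketch reference can call calculateRentalCost() without knowing whether the object is a car or a bus; the version that runs is chosen from the object's actual type at run time.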

(d)
For the correct answer (any one):
Easier / faster to debug/test — because there are far fewer mistakes in the smaller/individual modules. Speedier / faster completion of the project — because different teams work on different modules. Facilitates reusability of the code — as the existing modules can be reused across other modules. Improves code readability / organisation — smaller manageable modules leading to better logical organisation.

Modularity breaks a large, intimidating programming problem into small, independent chunks. When each module has a clear, narrow responsibility, you can test it in isolation — if the Car class works perfectly on its own, you know any integration bugs lie elsewhere. This compartmentalisation means multiple developers can work in parallel on different modules without stepping on each other’s toes, dramatically accelerating project completion.

Question 12

All Car objects have been read into a large unsorted array called allCars.
A method is needed to show customers the range of available cars.
This method should take the array allCars as a parameter and select Car objects from allCars so that every available brandModel is presented only once.
You may assume that there are never more than 100 different types of cars (as identified by the variable brandModel).
(a) Define the term parameter variable.
(b) Construct the code for the method findBrandModels() that will take the array allCars as a parameter. It must return a Car array that contains every brandModel that is available without duplication.
A customer wants to see which different types of cars are available. The criteria are it must be a petrol car with automatic transmission and cost less than 35 euros per day.
(c) Without writing code, outline the steps needed for a method to perform this query and present the results to the customer. 

Most-appropriate topic codes (IB Computer Science SL):

• Topic D.3  — Program development (Parts (a), (b), (c))

▶️ Answer/Explanation

(a)
For the correct answer:
The value/variable passed when the function/method is called; passed as a value or as a reference; is found in the parameter list of the method definition/signature.

A parameter variable is the named placeholder in a method’s signature that receives an argument when the method is invoked. When you call findBrandModels(allCars), the array reference allCars is the argument, and it gets bound to the parameter variable declared inside the parentheses of the method definition, making the passed data accessible by that local name within the method body.

(b)
For the correct answer (8 marks):
Correct method signature (excluding return type); instantiating a Car array (result) of size 100; loop through allCars with length condition; setting and resetting a variable (found or similar) inside the outer loop; loop that checks uniqueness; checking for a null pointer exception in at least one loop; correct test (use of equals() and ‘==’); correctly adding the Car when not found in result; returning the correct result array.

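public Car[] findBrandModels(Car[] allCars) {
    // At most 100 distinct brandModel values are assumed, per the question.
    Car[] result = new Car[100];
    int count = 0; // number of distinct cars stored so far
    // Stop at the end of the array or at the first empty (null) slot.
    for (int i = 0; i < allCars.length && allCars[i] != null; i++) {
        boolean found = false; // reset for every car examined
        // Compare against the cars already stored in result.
        for (int j = 0; j < count; j++) {
            if (result[j] != null && result[j].getBrandModel().equals(allCars[i].getBrandModel())) {
                found = true;
                break;
            }
        }
        if (!found) {
            result[count] = allCars[i];
            count++;
        }
    }
    return result; // unused slots remain null
}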

The algorithm uses a nested loop structure. The outer loop walks through every Car in allCars; for each one, the inner loop scans the result array to see if a Car with the same brandModel has already been stored. The found boolean flag tracks whether a match is discovered. Only if found remains false after the inner scan is the current Car added to result at the count position, ensuring the returned array contains each distinct brandModel exactly once, in the order they were first encountered.

(c)
For the correct answer (5 marks, outline only):
Create a result array to store the return value of findBrandModels(); iterate through the result array to check individual Car object; if the Car object does not fulfil all three criteria then remove this Car object from result / make this Car object null; iterate (or sort/search) through the result array to output the Car objects that are not null / return the result array. OR: Create a desiredCars array; iterate through the result of findBrandModels(); if an object fulfils the three conditions, copy the car object into desiredCars; output / return the desiredCars array.

You would first call findBrandModels() to get the de-duplicated list of all available car types. Then loop through that result array and, for each non-null entry, check three conditions: the fuel type is petrol (getFuelType() == true, assuming true means petrol), the transmission is automatic (getTransmissionType() == true, assuming true means automatic), and the daily price is less than 35 euros (getPricePerDay() < 35.0). Cars that pass all three tests are either added to a new filtered array or printed directly for the customer. Finally, present this filtered list, for example by iterating through it and outputting each matching brandModel and its price.
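Although the question asks for an outline only, the filtering step can be sketched in Java for revision purposes. The nested Car class is a minimal stand-in with public fields rather than the real class from the question, and the meaning of the two booleans (true = petrol, true = automatic) is an assumption, since the UML diagram is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

class CarQuerySketch {
    // Minimal stand-in for Car: only the attributes the query needs.
    static class Car {
        final String brandModel;
        final boolean fuelType;          // assumption: true = petrol
        final boolean transmissionType;  // assumption: true = automatic
        final double pricePerDay;

        Car(String brandModel, boolean fuelType,
            boolean transmissionType, double pricePerDay) {
            this.brandModel = brandModel;
            this.fuelType = fuelType;
            this.transmissionType = transmissionType;
            this.pricePerDay = pricePerDay;
        }
    }

    // uniqueCars plays the role of the array returned by findBrandModels().
    static List<Car> petrolAutomaticUnder35(Car[] uniqueCars) {
        List<Car> matches = new ArrayList<>();
        for (Car car : uniqueCars) {
            if (car == null) {
                continue;                // skip unused slots in the array
            }
            if (car.fuelType && car.transmissionType && car.pricePerDay < 35.0) {
                matches.add(car);
            }
        }
        return matches;
    }
}
```

A car priced at exactly 35 euros is excluded, because the criterion is "less than 35". The list is used here only to avoid tracking a separate count; a second Car array, as in the mark scheme, would work equally well.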

Question 13

The car rental company also has a database of customers. For each customer it stores an object with personal data such as their ID, name and address.
This Customer object includes the history of the cars they have rented and the car they are currently renting (if any).
(a) Draw the relationship between Customer and Car objects.
A suggestion has been made to modify the Rental class to include customerID.
The intention is to make it easier to find the customer who has a certain car.
(b) Describe in terms of dependencies why the suggestion to modify the Rental class to include customerID is inappropriate.
(c) Explain the ethical obligations for programmers when developing a customer database.

Most-appropriate topic codes (IB Computer Science SL):

• Topic D.1 — Objects as a programming concept (Parts (a), (b))
• Topic D.3  — Program development (Part (c))

▶️ Answer/Explanation

(a)
For the correct answer:

The UML-style line should connect Customer to Car with an aggregation or association arrow. Typically, a Customer object contains a collection (like a CarList) of Car objects representing rental history, plus an optional reference to a single Car for the current rental — this is a “has-a” relationship where the Customer holds references to multiple Cars over time.

(b)
For the correct answer:
The problem becomes that Car ‘has a’ Customer and Customer ‘has a’ car; This is a circular / duplicate / redundant relationship which may cause inconsistencies; It increases dependencies and causes more overhead when changes need to be made.

Adding customerID to Rental would create a circular dependency: Customer already maintains a list of Cars they’ve rented, so if Car also stores a reference back to Customer, you have two independent sources of “truth” about the same relationship. If a customer returns a car and the customerID in the Car object isn’t cleared simultaneously with its removal from the Customer’s history, the database enters an inconsistent state — a classic maintenance nightmare that violates the principle of keeping dependencies unidirectional and minimal.

(c)
For the correct answer (5 marks):
Obligation to respect privacy — only relevant data should be stored that helps the customer, to limit the impact on privacy. Obligation to provide data security — programmers should incorporate safeguards such as encryption, to limit the chance of personal data being misused. Obligation to protect data against corruption — programmers should incorporate data validation / verification, to limit the chance of incorrect personal data being stored. (Accept any other valid obligation.)

Programmers handling customer databases shoulder serious ethical weight. They must design systems that collect only necessary data (minimisation) — storing medical information or political affiliations when all you need is a driving licence number is indefensible. They are obligated to implement robust security: encrypting stored personal data, hashing passwords, and using parameterised queries to prevent SQL injection attacks that could leak entire customer tables. Furthermore, they must ensure data accuracy through validation rules and provide mechanisms for customers to access and correct their own records, honouring the principle that people have a right to control information about themselves.
