Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

Data modeling is an important part of designing databases; it serves as a blueprint for understanding and structuring data in a database system. A data model describes the database in terms of data, relationships, semantics, and constraints.

One such model is a Conceptual Data Model (CDM) which is the first step of database design.

In this tutorial, we’ll explore an overview of what conceptual data modeling is, highlight major components that constitute it, and explain its significance in database design.

2. Understanding a Conceptual Data Model

At the early stage of database design, we use a conceptual data model to define the high-level structure of the database. The CDM gives the basis on which requirements we should gather prior to designing the database.

Simply put, a conceptual data model describes the operations of the business regarding its data. Hence, a well-designed CDM effectively represents the database in the real-world context. A conceptual data model primarily focuses on:

  • Entities: These are the key objects of concern or concepts relevant to a domain. For instance, entities can include “Student,” “Course,” or “Exam”
  • Attributes: These are the properties that describe an entity. An attribute of the “Student” entity can be the “student’s name” or “birth date”
  • Relationships: This is the connection between different entities, such as the relationship between the “Student” and “Exam”

Thus, a CDM aims to frame data in a way that’s accessible to all stakeholders. In this way, non-technical stakeholders can contribute towards making informed decisions.

An example of such a representation is the entity relationship diagram (ERD), which we’ll dive into shortly.

2.1. Benefits of a CDM

A conceptual data model is significant in database design. Here are some of the reasons for constructing a CDM in SQL database design:

  • Early Issue Identification: helps to detect inconsistencies at the early stages, which reduces the need to redesign in the future
  • Project Overview: captures the key business entities, attributes, and relationships between the entities
  • Simplifying Complex Domains: helps businesses and designers tackle data complexity common to large domains by subdividing them into smaller, manageable components
  • Project Roadmap: serves as the roadmap for designing the logical and physical model afterward
  • Goal-focused: focuses on the goal of each business stakeholder. Although, it can be applied across businesses in the same domain with few modifications required

3. Defining a Conceptual Data Model

Of the three main modeling levels, which are conceptual, logical, and physical, the conceptual model is the first step that allows for the greatest input from the largest sets of stakeholders.

The most common tools for developing a conceptual model are ERD and the Natural Language Information Analysis Method (NIAM).

Regardless of which we use, conceptual data modeling takes several structured steps to provide an adequate representation of the domain:

  • Identify the Scope: Clearly define the elements within the system and identify the key data that need to be managed. Prioritize high-level perspectives that align with the business domain
  • Define Entities and Attributes: Enumerate and describe the major entities in the system and their significant attributes.
  • Establish the Relationships: Describe how the entities interact with each other using terms like enroll, borrow, own, or manage
  • Validate the Model: Review the model to ensure it accurately represents the domain and make modifications where necessary
  • Document Assumptions and Rules: Clearly document any assumptions, business rules, or constraints associated with the model to avoid making it too complex

After the above steps, use visual tools like ERD to create a visual representation of the model. The CDM design process is iterative, therefore, it’s important to continuously engage business stakeholders, domain experts, and the technical team to ensure that the CDM is a true representation of the business model.

4. Understanding Entity Relationship Diagrams

As specified earlier, an ERD is one of the tools that can be used to visually represent a CDM. It’s a visual construction of how disparate entities relate to one another within a database system. The diagram below represents the components of an ERD.

The components of an ERD model diagram. The shapes are used to represent the entity attribute and relationships. The lines are used to connect the entities to each other based on their relationships. Each line represents a different kind of relationship
Fig. 1. Components of an entity relationship diagram.

The above diagram shows how we visually represent each component of the entity relationship diagram”

  • The rectangle shape is used to represent an entity, such as “course”
  • The oval shape represents an entity’s attributes, such as “course title” and “course code.” Attributes, however, are optional when developing the conceptual model
  • The diamond shape represents the relationship between the different entities
  • The line connects the entities

5. Modeling a University Using an ERD

Now let us consider a simple scenario to design an ERD for a university.

A university consists of multiple departments operating in a particular field of study, like Computer Science, Business Administration, and Engineering. Each department offers several programs that students choose to earn credits towards their degree.

To assess students’ performance, the university conducts exams for each course, and students receive grades based on their performance. Let’s apply the steps from before to model our university database.

5.1. Identify the Scope

Each academic department at the university is responsible for offering and managing many courses. Each has a name and a unique code identifier.

Students can register for different courses each semester. Most courses have specific credit units allocated to them, which count towards the student’s graduation. Each student also has a student ID, name, email, birthdate, and enrollment and graduation dates in the system.

At the end of each semester, students write exams for each course. Taking the exams is mandatory since it demonstrates an understanding of each course. The university schedules all exams on certain dates and assigns grades to students after evaluating their performances. At the end of an exam, students receive their grades, indicating the extent of their comprehension.

5.2. Identify Entities and Attributes

It’s vital to identify the entities and attributes of the model. Attributes are optional at the CDM level. Now the four major entities in the university model are:

  • Student: Indicates the learners
  • Department: Academic divisions providing such courses
  • Course: Concentrated subjects provided by departments
  • Exam: Evaluations of students in courses

The figure below shows how we represent the entities:

The entities of the university database.
Fig. 2. Entities of the university database.

5.3. Establish Relationships Between the Entities

Now let’s outline the relationships between the entities.

  • Student -> Course
    • A student can enroll in multiple courses
    • A course can have multiple students enrolled
    • This captures a many-to-many relationship in the enrollment entity
  • Department -> Course
    • A department offers many courses.
    • Each course belongs to one department only (one-to-many relationship)
  • Student -> Exam
    • A student can write various exams
    • Multiple students write an exam
    • This reflects the many-to-many relationship between students and exams
  • Exam -> Course
    • An exam belongs to a particular course
    • A course may have only one exam
    • This completes the one-to-one relationship between the exam and the course

After establishing the relationships, we then update our previous diagram using the connectors that best describe the relationship between the entities. The diagram below outlines the conceptual model diagram for the above university scenario:

An entity relationship diagram representing a conceptual data model. It represents the entities in a university database and how they relate to each other.
Fig. 3. Conceptual model for a simple university database.

5.4. Document Assumptions and Rules

To further communicate what our model represents, we can document what limitations, assumptions, and rules we considered for the entities and relationships.

For this model, we have:

  1. University Structure: Each department is affiliated with one university. Thus, there are no inter-university collaborations. A course is offered and managed by only one department.
  2. Student Enrollment and Course Registration: Students may have several courses for a semester, but no specification was mentioned as to the maximum possible number of courses under which a student can be enrolled. Courses may have prerequisites that have not been modeled in this ERD. A student must be present for at least one course at any time.
  3. Examinations and Grading: Each course has exactly one exam per semester. Exams are compulsory for students attending a particular course. The grading model for student’s exam performance is not defined. We assumed that retaken tests would not be part of the initial model.
  4. Entities and Their Relationships: The model does not include faculty members, instructors, and administrative staff. The database will not represent the history concerning earlier semesters or courses that the students have taken.
  5. System Constraints: Every student has an ID. Every department has a code. Every course has a unique identification for each department.

6. Conclusion

In this article, we learned about CDM, the modeling process, the tools used for data modeling, and the importance of CDM in database design.

The conceptual data model is an important foundation for designing the best functioning database system. It encourages input from both business expectations and technical teams, encouraging connectivity and clarity, and supporting the model to fulfill the business goal.