Monday, March 20, 2017

Relational Databases

So there's data that we want to store, access, and manipulate. Relational databases are tools that enable us to do just that.

With relational databases, you have to define the structure of the data before you can do anything with it. This structure is expressed as tables that have columns. And those columns represent a specific type of data (numerical, date, text, etc).

Tables are the types of things that you have information about. Columns are information about those types of things. Rows are information about the actual things. For example, if I'm interested in working with medical data and I need a database of patient (a type of thing) information, I might have a table called patients and the patients have information like name, and social security number, and insurance.

The blueprint of a database, which is just a set of table definitions, is called a schema.

Here's an example:

Doctor (
   name CHAR(20)
   id PRIMARY KEY
)

Patient (
   name CHAR(20)
   social_security_number INTEGER(9) PRIMARY KEY
   doctor_id FOREIGN KEY REFERENCES doctor
)

Tables can also have keys which are a special type of information that is unique to every row in the table. In the case of patients, that might be their social security number since no two people will have the same social security number.

Using SQL to manipulate data in a relational DB

Now for this database to be us useful, it needs to contain data. We can create a db and manipulate data in that db by using SQL (structured query language) which is a language that you can use to define what you want to do with that data that the database understands.

> CREATE DATABASE example_hospital;

Adding a doctor

> INSERT INTO Doctor VALUES("Dr. Phil")

Get all the doctors!

> SELECT * FROM Doctor

Get all the names of all of the patients of Doctor with name "Dr. Phil" (this involves data from more than one table!)
To select data in multiple tables that are related, we have to join them by running JOIN statements.

> SELECT patient.name FROM Patient, Doctor WHERE patient.doctor_id = doctor.id AND doctor.name="Dr. Phil"

This is known as the inner join. There are no rows with a key value in a table that does not match up with the key value of of another row in the corresponding table. In other words, it excludes rows from both tables that do not link up.

Left outer - retain all the rows in the left table but exclude rows on the right if they don't match. The values in those rows are replace with the value NULL. This would include all the patients even if they don't have a doctor seeing them.

Right outer - retain all the rows in the right table. This would include all the doctors even if they don't have patients.

Full outer - retain rows in both tables! This would include all the patients and doctors.

Get the number of patients

> SELECT count(*) from Patient

Get the number of patients per doctor!

> SELECT doctor.name, count(patient.id) as num_patients FROM Patient, Doctor WHERE patient.doctor_id = doctor.id AND doctor.name="Dr. Phil" GROUP BY doctor.name

Maintaining Data Consistency

Databases also prevent you from trying to do things that it thinks is nonsensical. For example, if you say that a patient has a doctor and you insert a bunch of patients that reference doctors. Then you can't just delete doctors because then there will be foreign keys in the patients table that point to nothing. That violates referential integrity (which says that every row in a table with a foreign key must have that key point to an actual row in another table).

DB's also support features like transactions, where you can specify a series of operations that are treated as atomic. The changes only persist if every operation succeeds. Otherwise, no changes persist.


1 comment:

  1. I was diagnosed as HEPATITIS B carrier in 2013 with fibrosis of the
    liver already present. I started on antiviral medications which
    reduced the viral load initially. After a couple of years the virus
    became resistant. I started on HEPATITIS B Herbal treatment from
    ULTIMATE LIFE CLINIC (www.ultimatelifeclinic.com) in March, 2020. Their
    treatment totally reversed the virus. I did another blood test after
    the 6 months long treatment and tested negative to the virus. Amazing
    treatment! This treatment is a breakthrough for all HBV carriers.

    ReplyDelete