What is database indexing and how it works

What is database indexing and how it works

Sep 28, 2020

Let's say you have a dictionary book containing hundreds of thousands of words. Now, you may need to find the meaning of a word, say 'tenebrous'. What will you do? Will you search from the first page until you find tenebrous, or will you lookup for 't' index in your dictionary book and skip all other words to find the meaning of 'tenebrous'.

Indexing in the database uses the same concept as indexing in a dictionary book.

Suppose you have a table called Persons in your database, which has millions of rows containing First and Last Names of millions of Persons. For instance, here's an illustration of a similar table:

idfirst_namelast_name
1Ahmed AhmedKhan
2SamCurran
3SeanJordan
4JenyStokes
5RobertClarke
6PatrickCummins
.......
1000000LukeWright

Now, without indexing, when you perform a query to search for the person with the first name 'Sean' then your SQL query will look up all the 1 million rows to perform this simple operation. Now, think about how expensive such a simple query can be when data scales to ten million rows. This will be even more expensive when you join another table with another million records.

Hence, to tackle this issue, the database index comes to help. What indexing will do is sets up the column in sorted order to help in optimizing query performance and searching in a large record.

For example, if you create an index for the first_name from the above Persons table, which can be done by following SQL query:

create index person_index on persons("first_name");

Then, it will create a separate index named person_index for you in your database and store first_name column of the Persons table in sorted order. Similarly, along with the first_name, the index will also store the memory address of its corresponding column, so that to lookup for the data from other columns of the same row when the search operation is performed.

Now when you search for a person name 'Sean' in a Persons table, then it will only lookup for the words matching 'S' and will skip other words, just like a human being will search in a dictionary. With this, unnecessary searches can be skipped and the database performance can be boosted.

I hope this helps somebody. Thank you.