Index
Last updated
Was this helpful?
Last updated
Was this helpful?
If there are lots of row on database table, it is time-wasting to get the value from the row, and scan the value of row one by one
Index is set on primary key by default
When user needs to search the result frequency by entering some parameters or fields, we need to create the index that we will searching on
A clustered index
is collocated with the data in the same table space or same disk file. It will be created when primary key is created on the column
Data blocks are all the time moved around here & there by the OS whenever it’s necessary. A database system does not have any absolute control over how physical data space is managed, but inside a data block, records can be stored or managed in the logical order of the index key.
Don’t need to care about actually organizing the physical record in a certain order, rather a small index section is maintained in that order & fetching or maintaining records becomes very easy.
An index is usually maintained as a B+ Tree on disk & in-memory, and any index is stored in blocks on disk.
The leaf index block of the index contains a row locator. For the primary index, the row locator refers to virtual address of the corresponding physical location of the data blocks on disk where rows reside being sorted as per the index key.
Any index other than a clustered index is called a secondary index. Secondary indices does not impact physical storage locations unlike primary indices.
Secondary indices are needed on these columns if the frequency of such queries is very high.
Secondary index is also maintained in the B+ Tree and it’s sorted as per the key on which the index was created. The leaf nodes contain a copy of the key of the corresponding data in the primary index.
Retrieving data through the secondary index means you have to traverse two B+ trees — one is the secondary index B+ tree itself, and the other is the primary index B+ tree.
if you delete a primary index, all secondary indices have to be updated to contain a copy of the new primary index key
Like primary keys, unique keys can also identify records uniquely with one difference — the unique key column can contain null
values.
Let’s say we have an index defined on 4 columns — col1
, col2
, col3
, col4
. With a composite index, we have search capability on col1
, (col1, col2)
, (col1, col2, col3)
, (col1, col2, col3, col4)
we can’t omit a column from the middle & use that like — (col1, col3)
or (col1, col2, col4)
or col3
or col4
etc
While deciding the columns for a composite index, you can analyze different use cases of your system & try to come up with the order of columns that will benefit most of your use cases.
The column name
can contain large values of any length. Also in the index, the row locators’ or row pointers’ metadata have their own size.
it’s possible to create an index on the first few bytes of data as well. Example: the following command creates an index on the first 4 bytes of name. Though this method reduces memory overhead by a certain amount
It refers that the uniqueness of the result get by searching by the index. Fewer results get by searching the index and the bigger difference between each result means higher cardinality.
Setting low cardinality index, such as gender will decrease the performance of the query, as from the concept of b-tree geometry, index containing numbers of value will damage the performance, and the time complexity will be near to O(n)
At the opposite side, setting high cardinality index, the time complexity will be near to O(1), as the index is nearly unique.
The query optimizer will decide whether using index or not on searching based on cardinality
Only one index per table per query will be used by optimizer based on cardinality
Help us to increase the speed of searching data
With DML
operations, indices are updated, so write operations are quite costly with indexes. The more indices you have, the greater the cost. Indexes are used to make read operations faster.
indices consume extra memory