Previously in database basics:
1. SELECT, INSERT, CREATE and a REPL
2. binary expressions and WHERE filters
Next in database basics:
4. a database/sql driver
In this post, we extend gosql
to support indexes. We focus on the addition of PRIMARY
KEY
constraints on table creation and some easy optimizations
during SELECT
statements.
$ go run cmd/main.go
Welcome to gosql.
# CREATE TABLE users (id INT PRIMARY KEY, name TEXT, age INT);
ok
# \d users
Table "users"
Column | Type | Nullable
---------+---------+-----------
id | integer | not null
name | text |
age | integer |
Indexes:
"users_pkey" PRIMARY KEY, rbtree ("id")
This post will broadly be a discussion of this commit.
What is an index?
An index is a mapping of a value to a row in a table. The value is
often a column, but it can be many kinds of expressions. Databases
typically store indexes in tree structures that provide O(log(n))
lookup time. When SELECT
ing and filtering on a column
that is indexed, a database can greatly improve lookup time by
filtering first on this index. Without an index, a database must do a
linear scan for matching rows. Though sometimes if a condition is
broad enough, even with an index, a database may still end up doing a
linear scan.
While it may make sense initially to map a value to a row using a hash
table for constant lookup times, hash tables don't provide
ordering. So this would prevent an index from being applicable on
anything but equality checks. For example, SELECT x FROM y WHERE
x > 2
couldn't use a hash index on x
.
Indexes in many SQL databases default to a B-Tree, which offers efficient ordering of elements. These indexes are thus not constant-time lookups even if filtering on a unique column for a single item. Some databases, like PostgreSQL, allow you to use a hash-based index instead of a tree. Here the previously listed restrictions apply (i.e. only equality checks will use the index).
Upgrading gosql
We proceed as follows:
- Upgrade table creation to support specifying a primary key
- Pick a tree data structure for the index, adding it to the table
- Upgrade
INSERT
s to let any indexes on the table process the new row - Upgrade
SELECT
s to make use of any indexes, if possible
Upgrading table creation
To allow the specification of a single column as the primary key when creating a table, we have to first modify the lexer and parser.
Lexing/parsing
Since we've covered this process a few times already suffice it so say we make the following key additions:
- Add
PRIMARY KEY
as a new keyword token to the lexer - Add a check for this token to the parsing of column definitions
- Modify the AST to store a boolean value whether a column is a primary key
In-memory backend
Next we move on to handling a primary key during table creation.
Since there are many existing papers and blogs on implementing tree data structures, we will import an open-source implementation. And while most databases use a B-Tree, the most important properties of the tree for our purposes are 1) efficient ordering and 2) optionally duplicate keys. We go with a Red-Black Tree, GoLLRB.
The full definition of an index now includes:
- A name
- An expression (at first we only support this being an identifier referring to a column)
- A unique flag
- A type name (it will just be
rbtree
for now) - A primary key flag (so we know to apply null checks among other things)
- And the actual tree itself
type index struct {
name string
exp expression
unique bool
primaryKey bool
tree *llrb.LLRB
typ string
}
When we create a table, we add an index if one of the columns is a
primary key. We call out to a new public
method, CreateIndex
, that will handle actually setting
things up.
func (mb *MemoryBackend) CreateTable(crt *CreateTableStatement) error {
if _, ok := mb.tables[crt.name.value]; ok {
return ErrTableAlreadyExists
}
t := createTable()
t.name = crt.name.value
mb.tables[t.name] = t
if crt.cols == nil {
return nil
}
var primaryKey *expression = nil
for _, col := range *crt.cols {
t.columns = append(t.columns, col.name.value)
var dt ColumnType
switch col.datatype.value {
case "int":
dt = IntType
case "text":
dt = TextType
case "boolean":
dt = BoolType
default:
delete(mb.tables, t.name)
return ErrInvalidDatatype
}
if col.primaryKey {
if primaryKey != nil {
delete(mb.tables, t.name)
return ErrPrimaryKeyAlreadyExists
}
primaryKey = &expression{
literal: &col.name,
kind: literalKind,
}
}
t.columnTypes = append(t.columnTypes, dt)
}
if primaryKey != nil {
err := mb.CreateIndex(&CreateIndexStatement{
table: crt.name,
name: token{value: t.name + "_pkey"},
unique: true,
primaryKey: true,
exp: *primaryKey,
})
if err != nil {
delete(mb.tables, t.name)
return err
}
}
return nil
}
Implementing CreateIndex
is just a matter of adding a new
index to the table.
func (mb *MemoryBackend) CreateIndex(ci *CreateIndexStatement) error {
table, ok := mb.tables[ci.table.value]
if !ok {
return ErrTableDoesNotExist
}
for _, index := range table.indexes {
if index.name == ci.name.value {
return ErrIndexAlreadyExists
}
}
index := &index{
exp: ci.exp,
unique: ci.unique,
primaryKey: ci.primaryKey,
name: ci.name.value,
tree: llrb.New(),
typ: "rbtree",
}
table.indexes = append(table.indexes, index)
return nil
}
And that's it for creation of tables and indexes! Table creation is also the last time we need to make changes to the gosql frontend. The rest of the changes simply wrap existing insertion and selection.
Upgrading INSERT
When a row is inserted into a table, each index on that table needs to process the row so it can add value-to-row mappings to the index.
In the project code, you'll notice logic in CreateIndex
to also go back over all existing rows to add them to the new index.
This post omits further discussing the case where an index is
created after a table is created. After reading this post, that case
should be easy to follow.
Adding a row to an index is a matter of evaluting the index expression against that row and storing the resulting value in the tree. Along with the value, we store the integer index of the row in the table.
If the index is required to be unique, we first check that the value does not yet exist.
func (i *index) addRow(t *table, rowIndex uint) error {
indexValue, _, _, err := t.evaluateCell(rowIndex, i.exp)
if err != nil {
return err
}
if indexValue == nil {
return ErrViolatesNotNullConstraint
}
if i.unique && i.tree.Has(treeItem{value: indexValue}) {
return ErrViolatesUniqueConstraint
}
i.tree.InsertNoReplace(treeItem{
value: indexValue,
index: rowIndex,
})
return nil
}
And that's it for insertion!
Upgrading SELECT
Until now, the logic for selecting rows from a table is to pick the
table and iterate over all rows. If the row does not match
the WHERE
filter, we pass the row.
If the table has an index and we are using the index in a recognized
pattern in the WHERE
AST (more on that later), we can
pre-filter the table based on the index before iterating over each
row. We can do this for each index and for each time a recognized
pattern shows up.
This process is called query planning. We build a simplified
version of what you may see in SQL databases, specifically focusing
on index usage since we don't yet support JOIN
s. For
further reading, SQLite has
an excellent
document on their query planner for index usage.
func (mb *MemoryBackend) Select(slct *SelectStatement) (*Results, error) {
t := createTable()
if slct.from != nil {
var ok bool
t, ok = mb.tables[slct.from.value]
if !ok {
return nil, ErrTableDoesNotExist
}
}
if slct.item == nil || len(*slct.item) == 0 {
return &Results{}, nil
}
results := [][]Cell{}
columns := []ResultColumn{}
if slct.from == nil {
t = createTable()
t.rows = [][]memoryCell{{}}
}
for _, iAndE := range t.getApplicableIndexes(slct.where) {
index := iAndE.i
exp := iAndE.e
t = index.newTableFromSubset(t, exp)
}
for i := range t.rows {
result := []Cell{}
isFirstRow := len(results) == 0
if slct.where != nil {
val, _, _, err := t.evaluateCell(uint(i), *slct.where)
if err != nil {
return nil, err
}
if !*val.AsBool() {
continue
}
}
for _, col := range finalItems {
value, columnName, columnType, err := t.evaluateCell(uint(i), *col.exp)
if err != nil {
return nil, err
}
if isFirstRow {
columns = append(columns, ResultColumn{
Type: columnType,
Name: columnName,
})
}
result = append(result, value)
}
results = append(results, result)
}
return &Results{
Columns: columns,
Rows: results,
}, nil
}
It's very simple and easy to miss, here is the change called out:
for _, iAndE := range t.getApplicableIndexes(slct.where) {
index := iAndE.i
exp := iAndE.e
t = index.newTableFromSubset(t, exp)
}
getApplicableIndexes
There are probably a few very simple patterns we could look for, but
for now we look for boolean expressions joined by AND
that contain an index expression.
func (t *table) getApplicableIndexes(where *expression) []indexAndExpression {
var linearizeExpressions func(where *expression, exps []expression) []expression
linearizeExpressions = func(where *expression, exps []expression) []expression {
if where == nil || where.kind != binaryKind {
return exps
}
if where.binary.op.value == string(orKeyword) {
return exps
}
if where.binary.op.value == string(andKeyword) {
exps := linearizeExpressions(&where.binary.a, exps)
return linearizeExpressions(&where.binary.b, exps)
}
return append(exps, *where)
}
exps := linearizeExpressions(where, []expression{})
iAndE := []indexAndExpression{}
for _, exp := range exps {
for _, index := range t.indexes {
if index.applicableValue(exp) != nil {
iAndE = append(iAndE, indexAndExpression{
i: index,
e: exp,
})
}
}
}
return iAndE
}
More specifically though, within binary operations we only support matching on an index if the following three conditions are met:
- the operator is one of
=
,<>
,>
,<
,>=
, or<=
- one of the operands is an identifier literal that matches the index's
exp
value - the other operand is a literal value
This is a simpler, stricter matching of an index than PostgreSQL where you can index expressions more generally, not just identifer literals.
func (i *index) applicableValue(exp expression) *expression {
if exp.kind != binaryKind {
return nil
}
be := exp.binary
// Find the column and the value in the boolean expression
columnExp := be.a
valueExp := be.b
if columnExp.generateCode() != i.exp.generateCode() {
columnExp = be.b
valueExp = be.a
}
// Neither side is applicable, return nil
if columnExp.generateCode() != i.exp.generateCode() {
return nil
}
supportedChecks := []symbol{eqSymbol, neqSymbol, gtSymbol, gteSymbol, ltSymbol, lteSymbol}
supported := false
for _, sym := range supportedChecks {
if string(sym) == be.op.value {
supported = true
break
}
}
if !supported {
return nil
}
if valueExp.kind != literalKind {
fmt.Println("Only index checks on literals supported")
return nil
}
return &valueExp
}
And that's it for finding applicable indexes.
newTableFromSubset
The last remaining piece is to go from a boolean expression in
a WHERE
clause (where an index is applicable) to a subset
of rows in a table.
Since we are only working with patterns of the type
indexed-column OP literal-value
, we grab the literal
using the previous applicableValue
helper. Then we
look up that literal value in the index and return a new table with
every row in the index that meets the condition of the operator for the
literal value.
func (i *index) newTableFromSubset(t *table, exp expression) *table {
valueExp := i.applicableValue(exp)
if valueExp == nil {
return t
}
value, _, _, err := createTable().evaluateCell(0, *valueExp)
if err != nil {
fmt.Println(err)
return t
}
tiValue := treeItem{value: value}
indexes := []uint{}
switch symbol(exp.binary.op.value) {
case eqSymbol:
i.tree.AscendGreaterOrEqual(tiValue, func(i llrb.Item) bool {
ti := i.(treeItem)
if !bytes.Equal(ti.value, value) {
return false
}
indexes = append(indexes, ti.index)
return true
})
case neqSymbol:
i.tree.AscendGreaterOrEqual(llrb.Inf(-1), func(i llrb.Item) bool {
ti := i.(treeItem)
if bytes.Equal(ti.value, value) {
indexes = append(indexes, ti.index)
}
return true
})
case ltSymbol:
i.tree.DescendLessOrEqual(tiValue, func(i llrb.Item) bool {
ti := i.(treeItem)
if bytes.Compare(ti.value, value) < 0 {
indexes = append(indexes, ti.index)
}
return true
})
case lteSymbol:
i.tree.DescendLessOrEqual(tiValue, func(i llrb.Item) bool {
ti := i.(treeItem)
if bytes.Compare(ti.value, value) <= 0 {
indexes = append(indexes, ti.index)
}
return true
})
case gtSymbol:
i.tree.AscendGreaterOrEqual(tiValue, func(i llrb.Item) bool {
ti := i.(treeItem)
if bytes.Compare(ti.value, value) > 0 {
indexes = append(indexes, ti.index)
}
return true
})
case gteSymbol:
i.tree.AscendGreaterOrEqual(tiValue, func(i llrb.Item) bool {
ti := i.(treeItem)
if bytes.Compare(ti.value, value) >= 0 {
indexes = append(indexes, ti.index)
}
return true
})
}
newT := createTable()
newT.columns = t.columns
newT.columnTypes = t.columnTypes
newT.indexes = t.indexes
newT.rows = [][]memoryCell{}
for _, index := range indexes {
newT.rows = append(newT.rows, t.rows[index])
}
return newT
}
As you can see, an index may not necessarily improve on a linear
search in some conditions. Imagine a table of 1 million rows indexed
on an autoincrementing column. Imagine filtering on col >
10
. The index may be able to eliminate 10 items but still
return a pre-filtered table of around 1 million rows that must
be passed through the WHERE
filter.
Additionally since we process each boolean expression one at a time,
we can't take advantage of knowledge that might seem obvious to a
human for two boolean expressions that together bound a range. For
example in x > 10 AND x < 20
we can see that only
integers from 11 to 19 are applicable. But the current logic would go
through each expression separately and find all rows that match either
before the final linear search through all pre-filtered rows would
eliminate the bulk.
Thankfully real databases have decades of optimizations. But even
then it can be difficult to know what index usages are being
optimized without reading documentation, benchmarking, using
EXPLAIN ANALYSE
, or reading the source.
But that's it for changes needed to support basic indexes end-to-end!
Trialing an index
Since the addition of indexes is so seamless, it is difficult to tell without trial that the index is effective. So we write a simple program that inserts N rows with and without an index. Finally it will query for the first and last items inserted. We show time and memory used during both insertion and selection.
package main
import (
"fmt"
"os"
"runtime"
"strconv"
"time"
"github.com/eatonphil/gosql"
)
var inserts = 0
var lastId = 0
var firstId = 0
func doInsert(mb gosql.Backend) {
parser := gosql.Parser{}
for i := 0; i < inserts; i++ {
lastId = i
if i == 0 {
firstId = lastId
}
ast, err := parser.Parse(fmt.Sprintf("INSERT INTO users VALUES (%d)", lastId))
if err != nil {
panic(err)
}
err = mb.Insert(ast.Statements[0].InsertStatement)
if err != nil {
panic(err)
}
}
}
func doSelect(mb gosql.Backend) {
parser := gosql.Parser{}
ast, err := parser.Parse(fmt.Sprintf("SELECT id FROM users WHERE id = %d", lastId))
if err != nil {
panic(err)
}
r, err := mb.Select(ast.Statements[0].SelectStatement)
if err != nil {
panic(err)
}
if len(r.Rows) != 1 {
panic("Expected 1 row")
}
if int(*r.Rows[0][1].AsInt()) != inserts-1 {
panic(fmt.Sprintf("Bad row, got: %d", r.Rows[0][1].AsInt()))
}
ast, err = parser.Parse(fmt.Sprintf("SELECT id FROM users WHERE id = %d", firstId))
if err != nil {
panic(err)
}
r, err = mb.Select(ast.Statements[0].SelectStatement)
if err != nil {
panic(err)
}
if len(r.Rows) != 1 {
panic("Expected 1 row")
}
if int(*r.Rows[0][1].AsInt()) != 0 {
panic(fmt.Sprintf("Bad row, got: %d", r.Rows[0][1].AsInt()))
}
}
func perf(name string, b gosql.Backend, cb func(b gosql.Backend)) {
start := time.Now()
fmt.Println("Starting", name)
cb(b)
fmt.Printf("Finished %s: %f seconds\n", name, time.Since(start).Seconds())
var m runtime.MemStats
runtime.ReadMemStats(&m)
fmt.Printf("Alloc = %d MiB\n\n", m.Alloc/1024/1024)
}
func main() {
mb := gosql.NewMemoryBackend()
index := false
for i, arg := range os.Args {
if arg == "--with-index" {
index = true
}
if arg == "--inserts" {
inserts, _ = strconv.Atoi(os.Args[i+1])
}
}
primaryKey := ""
if index {
primaryKey = " PRIMARY KEY"
}
parser := gosql.Parser{}
ast, err := parser.Parse(fmt.Sprintf("CREATE TABLE users (id INT%s)", primaryKey))
if err != nil {
panic(err)
}
err = mb.CreateTable(ast.Statements[0].CreateTableStatement)
if err != nil {
panic(err)
}
indexingString := " with indexing enabled"
if !index {
indexingString = ""
}
fmt.Printf("Inserting %d rows%s\n", inserts, indexingString)
perf("INSERT", mb, doInsert)
perf("SELECT", mb, doSelect)
}
Build and run once without an index:
$ go build cmd/indextest/main.go
./main --inserts 1000000
Inserting 1000000 rows
Starting INSERT
Finished INSERT: 76.175133 seconds
Alloc = 239 MiB
Starting SELECT
Finished SELECT: 1.301556 seconds
Alloc = 238 MiB
And run again with an index:
./main --inserts 1000000 --with-index
Inserting 1000000 rows with indexing enabled
Starting INSERT
Finished INSERT: 89.108121 seconds
Alloc = 341 MiB
Starting SELECT
Finished SELECT: 0.000137 seconds
Alloc = 341 MiB
The basic tradeoff that you can see is that for more memory and longer insertion times, you get a significantly faster lookup.
Very excited to share the latest database basics post on implementing indexes in gosql.https://t.co/QHfjCe1XsC
— Phil Eaton (@phil_eaton) May 1, 2020