The most popular SQLite and PostgreSQL database drivers in Go are
(roughly) 20-76% slower than alternative Go drivers on insert-heavy
benchmarks of mine. So if you are bulk-inserting data with Go (and
potentially also bulk-retrieving data with Go), you may want to
consider the driver carefully. And you may want to consider avoiding
database/sql
.
Some driver authors have noted and benchmarked issues with database/sql.
So it may be the case that database/sql
is responsible for some of
this overhead. And indeed the variations between drivers in this post
will be demonstrated by using database/sql
and avoiding it. This post
won't specifically prove that the variation is due to the
database/sql
interface. But that doesn't change the premise.
Not covered in this post but something to consider:
JetBrains has
suggested that other frontends like sqlc, sqlx, and GORM do
worse than database/sql
.
This post is built on the workload, environment, libraries, and methodology in my databases-intuition repo on GitHub. See the repo for details that will help you reproduce or correct me.
INSERT workload
In this workload, the data is random and there are no indexes. Neither of these aspects matter for this post though because we're comparing behavior within the same database among different drivers. This was just a workload I already had.
Two different data sizes are tested:
- 10M rows with 16 columns, each column is 32 bytes
- 10M rows with 3 columns, each column is 8 bytes
Each test is run 10 times and we record median, standard deviation, min, max and throughput.
SQLite
Both variations presented here load 10M rows using a single prepared statement called for each row within a single transaction.
The most popular driver is mattn/go-sqlite3.
It is roughly 20-40% slower than another driver that avoids
database/sql
.
10M Rows, 16 columns, each column 32 bytes:
Timing: 56.53 ± 1.26s, Min: 55.05s, Max: 59.62s
Throughput: 176,893.65 ± 3,853.90 rows/s, Min: 167,719.97 rows/s, Max: 181,646.02 rows/s
10M Rows, 3 columns, each column 8 bytes:
Timing: 15.92 ± 0.25s, Min: 15.69s, Max: 16.67s
Throughput: 628,044.37 ± 9,703.92 rows/s, Min: 599,852.91 rows/s, Max: 637,435.60 rows/s
The other driver I tested is my own fork of bvinc/go-sqlite-lite called eatonphil/gosqlite. I forked it because it is unmaintained and I wanted to bring it up-to-date for tests like this.
10M Rows, 16 columns, each column 32 bytes:
Timing: 45.51 ± 0.70s, Min: 43.72s, Max: 45.93s
Throughput: 219,729.65 ± 3,447.56 rows/s, Min: 217,742.98 rows/s, Max: 228,711.51 rows/s
10M Rows, 3 columns, each column 8 bytes:
Timing: 10.44 ± 0.20s, Min: 10.02s, Max: 10.68s
Throughput: 957,939.60 ± 18,879.43 rows/s, Min: 936,114.60 rows/s, Max: 998,426.62 rows/s
PostgreSQL
Both variations presented use PostgreSQL's COPY
FROM
support. This is significantly faster for PostgreSQL than doing the
prepared statement we do in
SQLite. (Here
are my results for doing prepared statement INSERTs in PostgreSQL if
you are curious.)
The most popular PostgreSQL driver is lib/pq. The performance issues with lib/pq are well-known, and the repo itself is marked as no longer developed.
It is roughly 44-76% slower than an alternative driver that avoids
database/sql
.
10M Rows, 16 columns, each column 32 bytes:
Timing: 104.53 ± 2.40s, Min: 102.57s, Max: 110.08s
Throughput: 95,665.37 ± 2,129.25 rows/s, Min: 90,847.08 rows/s, Max: 97,490.96 rows/s
10M Rows, 3 columns, each column 8 bytes:
Timing: 8.16 ± 0.43s, Min: 7.44s, Max: 8.80s
Throughput: 1,225,986.47 ± 66,631.53 rows/s, Min: 1,136,581.82 rows/s, Max: 1,343,441.37 rows
The other driver I tested is
jackc/pgx, without database/sql
.
10M Rows, 16 columns, each column 32 bytes:
Timing: 46.54 ± 1.60s, Min: 44.09s, Max: 49.51s
Throughput: 214,869.42 ± 7,265.10 rows/s, Min: 201,991.37 rows/s, Max: 226,801.07 rows/s
10M Rows, 3 columns, each column 8 bytes:
Timing: 5.20 ± 0.44s, Min: 4.71s, Max: 5.96s
Throughput: 1,923,722.79 ± 156,820.46 rows/s, Min: 1,676,894.32 rows/s, Max: 2,124,966.60 rows/
The discrepancies here are even greater than with the different SQLite drivers.
Workloads with small resultset
I won't go into as much detail but if you're doing queries that don't return many rows, the difference between drivers is negligible.
See here for details.
Conclusion
If you are doing INSERT-heavy workloads, or you are processing large number of rows returned from your SQL database, you might want to try benchmarking the same workload with different drivers.
And specifically, there is likely no good reason to use lib/pq
anymore for accessing PostgreSQL from Go. Just use jackc/pgx.
For INSERT-heavy workloads in Go, you may want to switch database drivers. For PostgreSQL and SQLite, the popular drivers are 20-76% slower for this workload in my tests.
— Phil Eaton (@eatonphil) October 6, 2023
Some driver developers have reported issues with database/sql as an interface.https://t.co/NLVp0P2uiV pic.twitter.com/RxTbgMZ1MG