Using UUID with PostgreSQL and ActiveRecord
Read in 3 minutes
PostgreSQL provides lots of column types and ActiveRecord maps some of them. One column type that I’m using more and more is uuid
.
The UUID has different implementations, but the most used version is v4. It generates random strings following the xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
format, where:
x
is hexadecimal digit.y
can be one of8
,9
,A
, orB
.
Let’s generate some identifiers using the uuidgen
command. Note how they follow the specification I just mentioned.
847862BF-424C-482F-81B1-09A48CFD547F
A378F7E1-94AD-432E-9DB1-41BADDBDAE8F
077570B7-63F5-4FB0-8A11-97BBF6998501
2B99E07E-36D4-4E07-9A6C-8DBEB66DA9BF
You can generate uuidv4 in Ruby too. All you have to do is using the SecureRandom.uuid
method, available as part of the standard library.
require 'securerandom'
SecureRandom.uuid
#=> ff039085-5e16-4d74-9039-4225533e1f35
You can also generate uuidv4 using PostgreSQL. First, you have to activate the uuid-ossp extension.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
Then you can generate identifiers using the uuid_generate_v4()
function or define uuidv4 columns.
SELECT uuid_generate_v4();
-- cfe48d4f-246f-4f24-85b5-fb1d1a8a520e
CREATE TABLE users (
id uuid PRIMARY KEY NOT NULL DEFAULT uuid_generate_v4(),
name text NOT NULL
);
INSERT INTO users (name) VALUES ('John Doe');
-- SELECT * FROM users;
-- id | name
-- -------------------------------------+----------
-- e84f4f0e-2127-4e66-a0be-1463141ae7ae | John Doe
Using uuidv4 as your record identifier has some advantages. The first one isn’t obvious, but you can decentralize the id generation. Let’s say you have a client-side API consumer. Instead of waiting for your server to respond with success or error, you can assume that everything worked and asynchronously push local records to the server API.
Another advantage is that you won’t give tips about how many records your system has. It may seem a silly advantage, but hiding this info is important for some customers. It can also avoid leaking information about your stocks, as Twitter did today.
Implementing UUID in a Rails application
ActiveRecord’s PostgreSQL adapter has built-in support for uuid columns. All you have to do is using the uuid
method.
enable_extension 'uuid-ossp'
create_table :users do |t|
t.uuid :api_key, null: false, default: 'uuid_generate_v4()'
end
You can also define your primary keys as uuid columns.
create_table :users, id: :uuid do |t|
# ...
end
Finally, defining your association column is as simple as setting the :type
option.
create_table :posts, id: :uuid do |t|
t.references :user, type: :uuid, null: false, index: true
# or
t.belongs_to :user, type: :uuid, null: false, index: true
end
The same concept applies to the add_reference
method.
add_reference :posts, :users, type: :uuid, null: false, index: true
It’s unlikely that you have duplicated ids. But if that happen, an exception will be raised, since primary keys have an unique constraint.
Gotchas
Using uuid comes with some drawbacks. Some methods, like ActiveRecord::Base.first
and ActiveRecord::Base.last
won’t work as expected. ActiveRecords expects a sequential identifier, as you can see below.
User.first
#=> SELECT * FROM users ORDER id ASC LIMIT 1
If you need to use these methods, make sure you sort your results using the created_at
column or a sequential column (you can use a bigserial
column for this).
User.order(created_at: :asc).first
#=> SELECT * FROM users ORDER created_at ASC LIMIT 1
There’s more. The ActiveRecord::Base.find_in_batches
and ActiveRecord::Base.find_each
won’t work at all, since they ignore sorting scopes. If you need this functionality, you’ll have to implement it on your own.
Avoiding the boring parts
Setting the primary key and association column types every time is too error prone. To avoid doing this manually, I created a gem that overrides the default methods, adding the column type for you. All you have to do is adding the following line to your Gemfile and you’re done. No more id: :uuid
or type: :uuid
.
gem 'ar-uuid'
Wrapping up
Even with these ActiveRecord inconsistencies, you can easily use uuid as primary keys. Implementing a smarter behavior for ActiveRecord::Base.first
, ActiveRecord::Base.last
, ActiveRecord::Base.find_each
and ActiveRecord::Base.find_in_batches
wouldn’t be too hard; maybe checking the primary column type so that ActiveRecord could switch the sorting to a bigserial
column or even created_at
.
Do you use uuid as your primary keys? Do you other motivations for doing it so? Leave your comment!