PostgreSQL provides lots of column types and ActiveRecord maps some of them. One column type that I’m using more and more is
The UUID has different implementations, but the most used version is v4. It generates random strings following the
xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx format, where:
xis hexadecimal digit.
ycan be one of
Let’s generate some identifiers using the
uuidgen command. Note how they follow the specification I just mentioned.
847862BF-424C-482F-81B1-09A48CFD547F A378F7E1-94AD-432E-9DB1-41BADDBDAE8F 077570B7-63F5-4FB0-8A11-97BBF6998501 2B99E07E-36D4-4E07-9A6C-8DBEB66DA9BF
You can generate uuidv4 in Ruby too. All you have to do is using the
SecureRandom.uuid method, available as part of the standard library.
require 'securerandom' SecureRandom.uuid #=> ff039085-5e16-4d74-9039-4225533e1f35
You can also generate uuidv4 using PostgreSQL. First, you have to activate the uuid-ossp extension.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
Then you can generate identifiers using the
uuid_generate_v4() function or define uuidv4 columns.
SELECT uuid_generate_v4(); -- cfe48d4f-246f-4f24-85b5-fb1d1a8a520e CREATE TABLE users ( id uuid PRIMARY KEY NOT NULL DEFAULT uuid_generate_v4(), name text NOT NULL ); INSERT INTO users (name) VALUES ('John Doe'); -- SELECT * FROM users; -- id | name -- -------------------------------------+---------- -- e84f4f0e-2127-4e66-a0be-1463141ae7ae | John Doe
Using uuidv4 as your record identifier has some advantages. The first one isn’t obvious, but you can decentralize the id generation. Let’s say you have a client-side API consumer. Instead of waiting for your server to respond with success or error, you can assume that everything worked and asynchronously push local records to the server API.
Another advantage is that you won’t give tips about how many records your system has. It may seem a silly advantage, but hiding this info is important for some customers. It can also avoid leaking information about your stocks, as Twitter did today.
Implementing UUID in a Rails application
ActiveRecord’s PostgreSQL adapter has built-in support for uuid columns. All you have to do is using the
enable_extension 'uuid-ossp' create_table :users do |t| t.uuid :api_key, null: false, default: 'uuid_generate_v4()' end
You can also define your primary keys as uuid columns.
create_table :users, id: :uuid do |t| # ... end
Finally, defining your association column is as simple as setting the
create_table :posts, id: :uuid do |t| t.references :user, type: :uuid, null: false, index: true # or t.belongs_to :user, type: :uuid, null: false, index: true end
The same concept applies to the
add_reference :posts, :users, type: :uuid, null: false, index: true
It’s unlikely that you have duplicated ids. But if that happen, an exception will be raised, since primary keys have an unique constraint.
Using uuid comes with some drawbacks. Some methods, like
ActiveRecord::Base.last won’t work as expected. ActiveRecords expects a sequential identifier, as you can see below.
User.first #=> SELECT * FROM users ORDER id ASC LIMIT 1
If you need to use these methods, make sure you sort your results using the
created_at column or a sequential column (you can use a
bigserial column for this).
User.order(created_at: :asc).first #=> SELECT * FROM users ORDER created_at ASC LIMIT 1
There’s more. The
ActiveRecord::Base.find_each won’t work at all, since they ignore sorting scopes. If you need this functionality, you’ll have to implement it on your own.
Avoiding the boring parts
Setting the primary key and association column types every time is too error prone. To avoid doing this manually, I created a gem that overrides the default methods, adding the column type for you. All you have to do is adding the following line to your Gemfile and you’re done. No more
id: :uuid or
Even with these ActiveRecord inconsistencies, you can easily use uuid as primary keys. Implementing a smarter behavior for
ActiveRecord::Base.find_in_batches wouldn’t be too hard; maybe checking the primary column type so that ActiveRecord could switch the sorting to a
bigserial column or even
Do you use uuid as your primary keys? Do you other motivations for doing it so? Leave your comment!