
Primary Keys and IDs

You may have noticed that our sample database tables all define an integer
column called id as their primary key. This is an Active Record convention.
“But wait!” you cry. “Shouldn’t the primary key of my orders table be the
order number or some other meaningful column? Why use an artificial
primary key such as id?”

The reason is largely a practical one: the format of external data may
change over time. For example, you might think that the ISBN of a book
would make a good primary key in a table of books. After all, ISBNs are
unique. But as this particular book is being written, the publishing industry
in the US is gearing up for a major change as additional digits are
added to all ISBNs.

If we’d used the ISBN as the primary key in a table of books, we’d have to
go through and update each row to reflect this change. But then we’d have
another problem. There’ll be other tables in the database that reference
rows in the books table via the primary key. We can’t change the key in the
books table unless we first go through and update all of these references.
And that will involve dropping foreign key constraints, updating the
referencing tables, updating the books table, and finally reestablishing
the constraints. All in all, something of a pain.

If we use our own internal value as a primary key, things work out a lot
better. No third party can come along and arbitrarily tell us to change
things—we control our own keyspace. And if something such as the ISBN
does need to change, it can change without affecting any of the existing
relationships in the database. In effect, we’ve decoupled the knitting
together of rows from the external representation of data in those rows.
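The difference between the two approaches can be sketched in plain Ruby, with hashes standing in for tables. This is only an illustration under stated assumptions: the method names, the reviews table, and the sample ISBN values are hypothetical and do not come from any real schema. Each method returns the number of rows it had to rewrite.

```ruby
# Illustrative in-memory stand-ins for database tables.
# All names and values here are hypothetical.

# Surrogate-key layout: reviews point at an internal id, so changing the
# ISBN rewrites exactly one row. Returns the number of rows written.
def change_isbn_with_surrogate_key(books, book_id, new_isbn)
  books.fetch(book_id)[:isbn] = new_isbn
  1
end

# Natural-key layout: the ISBN is the link between tables, so the book row
# and every referencing review must be rewritten. Returns rows written.
def change_isbn_with_natural_key(books, reviews, old_isbn, new_isbn)
  books[new_isbn] = books.delete(old_isbn)
  rows_written = 1
  reviews.each do |review|
    next unless review[:book_isbn] == old_isbn
    review[:book_isbn] = new_isbn
    rows_written += 1
  end
  rows_written
end

old_isbn = "0-306-40615-2"      # sample 10-digit form
new_isbn = "978-0-306-40615-7"  # sample 13-digit form

# Internal id as the key: the reviews never mention the ISBN.
books   = { 1 => { isbn: old_isbn, title: "Sample Title" } }
reviews = [{ book_id: 1, text: "Loved it" }, { book_id: 1, text: "Too long" }]
puts change_isbn_with_surrogate_key(books, 1, new_isbn)  # prints 1

# ISBN as the key: the reviews carry a copy of it.
nk_books   = { old_isbn => { title: "Sample Title" } }
nk_reviews = [{ book_isbn: old_isbn, text: "Loved it" },
              { book_isbn: old_isbn, text: "Too long" }]
puts change_isbn_with_natural_key(nk_books, nk_reviews, old_isbn, new_isbn)  # prints 3
```

With the internal id, the review rows are never touched. With the ISBN as the key, every referencing row is rewritten along with the book row, and in a real database that is where the foreign key constraints get in the way.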

Now there’s nothing to say that we can’t expose the id value to our end
users. In the orders table, we could externally call it an order id and print
it on all the paperwork. But be careful doing this: at any time some regulator
may come along and mandate that order ids must follow an externally
imposed format, and you’d be back where you started.

If you’re creating a new schema for a Rails application, you’ll probably
want to go with the flow and give all of your tables an id column as their
primary key. (As we’ll see later, join tables are not included in this
advice: they should not have an id column.) If you need to work with an
existing schema, Active Record gives you a simple way of overriding the
default name of the primary key for a table.

    class BadBook < ActiveRecord::Base
      set_primary_key "isbn"
    end

Normally, Active Record takes care of creating new primary key values
for records that you create and add to the database: they’ll be ascending
integers (possibly with some gaps in the sequence). However, if you override
the primary key column’s name, you also take on the responsibility
of setting the primary key to a unique value before you save a new row.
Perhaps surprisingly, you still set an attribute called id to do this. As far as
Active Record is concerned, the primary key attribute is always set using
an attribute called id. The set_primary_key declaration sets the name of the
column to use in the table. In the following code, we use an attribute
called id even though the primary key in the database is isbn.

    book = BadBook.new
    book.id = "0-12345-6789"
    book.title = "My Great American Novel"
    book.save
    # ...
    book = BadBook.find("0-12345-6789")
    puts book.title    # => "My Great American Novel"
    p book.attributes  # => {"isbn"=>"0-12345-6789",
                       #     "title"=>"My Great American Novel"}
Just to make things more confusing, the attributes of the model object
have the column names isbn and title—id doesn’t appear. When you need
to set the primary key, use id. At all other times, use the actual column

