Chan Chen Coding...

Optimizing Object IDs

The _id field of a MongoDB document is its primary key and is always indexed for normal collections. This page lists some recommendations for choosing _id values. It is common to use the BSON ObjectId datatype for _id values, but an _id field can hold a value of any type.

Use the collection's 'natural primary key' in the _id field.

_id's can be any type, so if your objects have a natural unique identifier, consider using that in _id to both save space and avoid an additional index.

When possible, use _id values that are roughly in ascending order.

If the _id values arrive in a reasonably well-defined ascending order, inserts touch only one edge of the _id index, so the entire B-tree need not be loaded into memory. BSON ObjectIds have this property.

Store Binary GUIDs as BinData, rather than as hex encoded strings

BSON includes a binary data datatype for storing byte arrays. Using it makes the id values, and their respective keys in the _id index, half the size of their hex-encoded string equivalents.

Note that unlike the BSON ObjectId type (see above), most UUIDs are not in roughly ascending order, so more of their index must be kept in cache.
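
The size difference can be checked outside the database. The following is a minimal sketch in plain JavaScript (run under Node.js, which is an assumption here, as is the sample UUID value). It only compares the length of the hex text with the length of the raw bytes and does not talk to MongoDB.

```javascript
// A sample UUID in its usual text form (illustrative value, not from the original post).
const uuidText = "123e4567-e89b-12d3-a456-426614174000";

// Hex-encoded form without dashes: 32 characters, one byte each when stored as a string.
const uuidHex = uuidText.replace(/-/g, "");

// Raw binary form: 16 bytes, which is the payload a BinData value would carry.
const uuidBytes = Buffer.from(uuidHex, "hex");

// Base64 text of the same bytes, the form the shell's BinData constructor expects.
const uuidBase64 = uuidBytes.toString("base64");

console.log(uuidHex.length);   // 32
console.log(uuidBytes.length); // 16
```

The base64 string is what would be passed as the second argument of new BinData(subtype,base64str) in the shell help output; which subtype to use depends on your driver's conventions.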

> // mongo shell bindata info: 
> help misc
    b = new BinData(subtype,base64str)     create a BSON BinData value     
    b.subtype()     the BinData subtype (0..255)
    b.length()     length of the BinData data in bytes
    b.hex()     the data as a hex encoded string
    b.base64()     the data as a base 64 encoded string
    b.toString()

Extract insertion times from _id rather than having a separate timestamp field.

The BSON ObjectId format gives every document a creation timestamp (one-second granularity) for free. Almost all drivers implement methods for extracting these timestamps; see the relevant API docs for details. In the shell:

> // mongo shell ObjectId methods 
> help misc
    o = new ObjectId() create a new ObjectId
    o.getTimestamp() return timestamp derived from first 32 bits of the OID
    o.isObjectId()
    o.toString()
    o.equals(otherid)
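
When only the 24-character hex form of an ObjectId is at hand (for example in a log line), the same timestamp can be recovered by hand, since it is stored in the first 32 bits. This is a minimal sketch in plain JavaScript; objectIdHexToDate is a hypothetical helper name, not a driver or shell method, and the sample id is illustrative.

```javascript
// Hypothetical helper: read the creation time out of an ObjectId's hex string.
function objectIdHexToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // First 8 hex characters = first 32 bits = seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdHexToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

In the shell or a driver, prefer the built-in method (o.getTimestamp() above) over hand parsing.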

Sort by _id to sort by insertion time

BSON ObjectIds begin with a timestamp, so sorting by _id, when using the ObjectId type, sorts roughly by insertion time. Note: the timestamp portion of the ObjectId has a granularity of one second only.

> // get 10 newest items 
> db.mycollection.find().sort({_id:-1}).limit(10);
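
Why this works can be illustrated without a database. The sketch below, in plain JavaScript, builds ObjectId-shaped hex strings with a hypothetical helper (fakeObjectIdHex, with made-up timestamps and filler bytes) and shows that ordering the ids also orders the leading timestamps. It assumes ids compare by their bytes, leading timestamp first.

```javascript
// Hypothetical helper: an ObjectId-shaped hex string made of a 4-byte timestamp
// (seconds) followed by 16 arbitrary hex characters standing in for the other 8 bytes.
function fakeObjectIdHex(seconds, restHex) {
  return seconds.toString(16).padStart(8, "0") + restHex;
}

// Deliberately out of time order, with filler bytes that disagree with the time order.
const ids = [
  fakeObjectIdHex(1330322760, "aaaaaaaaaaaaaaaa"),
  fakeObjectIdHex(1330322700, "ffffffffffffffff"),
  fakeObjectIdHex(1330322820, "0000000000000000"),
];

// Fixed-width lowercase hex sorts the same way as the underlying bytes,
// so a descending sort of the ids puts the newest timestamp first.
const newestFirst = [...ids].sort().reverse();
const seconds = newestFirst.map((id) => parseInt(id.slice(0, 8), 16));
console.log(seconds); // [ 1330322820, 1330322760, 1330322700 ]
```

Ids created within the same second share a timestamp, so their relative order is decided by the remaining bytes rather than by exact insertion order.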



posted on 2012-02-27 14:06 Chan Chen  Category: DB

