paulwong

How to delete large amount of data of a MongoDB collection “quickly”

We have a db collection that is around 30 million documents, and I need to trim it down, to only keeping the documents created on the last month. 

One approach would be use the remove command with a condition on the created_at field (the collection already have an index on this field):

db.my_collection.remove({created_at: {$lte: new Date("11/01/2012")}});

But this approach will be very slow, instead of that, a better way to do it is rename the current collection (for instance to “old_collection”) using renameCollection. Then performing a query-and-insert from the “old_collection” into “my_collection”:

db.my_collection.renameCollection("old_collection");  
db.createCollection("my_collection");
db.my_collection.createIndex(...); // recreate the indexes for the collection
// copy docs from old collection into the new collection
db.old_collection.find(
{created_at: {$gte: new Date("11/01/2012")}} ).sort({_id: -1}).forEach(
function(row) { db.my_collection.insert(row); } ); // drop old collection db.old_collection.drop();

This approach is typically faster than running a bunch of removes on your data

posted on 2015-12-10 20:09 paulwong 阅读(533) 评论(0)  编辑  收藏 所属分类: MONGODB


只有注册用户登录后才能发表评论。


网站导航: