Cross database queries in MongoDB

Until recently, I could accurately claim that I’ve spent more time hacking the source code to mongod, then writing code that made db calls to running instances of mongod. That was before I started my current project. For better or for worse, I’m approaching the point where I’m as comfortable with querying mongo collections as I am doing multi table joins in SQL server.

Naturally, as I use MongoDB I find myself asking a lot of “how do I do this in mongo” questions for tasks that I am able to do easily in SQL. More often than not, my main trouble in figuring out how to do the task in question is knowing what to ask google. Recently, my “How do I du jur” was cross database queries.

To define my problem more specifically, I had one document in one collection in my staging database that I wanted deployed to production. My staging and production databases lived on the same server. I realize this is not ideal, but it is the reality of my current situation. If I were to do the equivalent task in Microsoft SQL server, that is copy one row from a table in my staging database into one row in my production database, I’d use a query similar to the following:

USE productionDb
INSERT INTO tableName (id, name, subtitle, isBoolean, misc)
    SELECT id, name, subtitle, isBoolean, misc
        FROM stagingDb..tableName
        WHERE id='6f84dc60-fce6-11df-8cff-0800200c9a66'

A simple query for a simple task. It turns out the equivalent mongo query is about as simple. It just took me a while to find the right syntax, because the docs did not refer to it as a cross database query until I updated them. The shell command is db.getSisterDB(dbName). That functions returns an instance of another db on the server, which in turn contains collection objects that have the familar methods find(), findOne(), update(), save(), remove(), etc. So I did the following:
        .findOne({_id: ObjectId("4cfc182f2c320000000013b4")}))

and my object was copied over.

However, there is one caveat to be aware of. Most drivers allow you the full range of bson data types. The shell does not. For example a 32 bit int in a mongo document becomes a double in the shell. So the data is not copied perfectly. I discovered this issue while using a pre-release of the official 1ogen CSharp driver for MongoDB. After some update queries in the shell, objects were not being de-serialized. Luckily, the great people at 10gen made the driver more tolerant on deserialization so this is no longer a problem with current builds of the driver. There are open tickets to add shell support for the missing data types (int32s and GUIDs), so the deficiency of the shell will be addressed. However, until then, be aware of the caveat I mentioned.