MongoDB Queries and Projections

Methods, Queries, and Projections

In any MongoDB call—often referred to as a “MongoDB query”—we have four parts:

  1. Collection Name – The set of documents we’d like to look at.
  2. Collection Method – The MongoDB method that we’ll use to look for documents.
  3. Query – A document describing what we’re looking for in our collection.
  4. Projection – Another document describing how we want the data matching our query to be returned to us.

To make that a little clearer, here’s a diagram:

Structure of a MongoDB query.

In this example, we start with our collection name, Posts, and call the find() collection method on it. Then, as the first argument to find(), we pass a query document containing the criteria for our search. Finally, we pass a projection, which tells find(), “for all of the documents that you find, when you return them, make sure they only have the title field.”

The first two parts here are required (collection and collection method), while the query and projection arguments are optional. Further, this is a pretty simple MongoDB call. As we’ll learn in a bit, both queries and projections can take on several different shapes and serve several purposes. While the bulk of our MongoDB calls are likely to be simple like the example above, we may have situations where we need to be as specific as possible about what data we’re getting back.
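As a rough sketch of those four parts in code: the Posts object below is a hypothetical stand-in (not a real Meteor collection) that simply echoes back what it was given, and the category criteria is made up for illustration. The fields wrapper around the projection is how Meteor expects it in the second argument.

```javascript
// Hypothetical stand-in for a Meteor collection so the call shape can run
// outside of Meteor. It only echoes back the arguments it receives.
const Posts = {
  name: "posts",
  find(query = {}, options = {}) {
    return { collection: this.name, method: "find", query, options };
  }
};

// 1. collection  2. method  3. query          4. projection
const call = Posts.find({ category: "recipes" }, { fields: { title: 1 } });
console.log(call);

// The query and projection are optional; leaving them off asks for everything.
const everything = Posts.find();
console.log(everything.query, everything.options); // {} {}
```

In a real app, Posts would be an actual collection and find() would hand back a cursor rather than this echo object.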

Collection names are pretty clear, so let’s look at the three tools we use to actually retrieve data from our collection on their own: Methods, Queries, and Projections.

JUST SELECTIONS

Hopefully it’s clear at this point, but the remainder of this snippet will only look at selecting existing data in MongoDB, not handling inserts or updates. Those involve different techniques, so we’ll cover them in a future snippet.

Using MongoDB Collection Methods for selection

In Meteor, we have access to two selection methods when we’re making calls on our database: find() and findOne(). As their names imply, find() is designed to return multiple documents (though it may return just one depending on the query), while findOne() is designed to return a single document.

The find() method

The actual difference between these two methods is that the find() method returns something known as a MongoDB Cursor. From the MongoDB Glossary, a Cursor is defined as:

A pointer to the result set of a query. Clients can iterate through a cursor to retrieve results.

— via MongoDB Glossary

Yeah, okay. What is it really? What it says! Think about the mouse pointer on screen (also known as a cursor). When you hover over something you’re pointing at it. This is the basic idea of a cursor. Based on the query and projection that we pass to our selection method, we’ll get back a cursor which points to the documents in our collection that match that query and projection. Neat! In less ambiguous terms, we can see a cursor when we type Collection.find() into our console.

Example of a MongoDB cursor.

Cool! But wait…this is just a bunch of properties and methods. Where is our data? Good question. If we drill down on that collection property a bit, we can find our actual data is returned inside of two sub documents: collection._docs._map. It looks something like this:

Example of the documents map in a MongoDB cursor.

Interesting…making some sense now? So, here, we can see that we get back a whole list of documents—or map—based on the query and projection we pass to MongoDB. Actually, in this example, we’ve simply called Documents.find() which tells MongoDB to return us everything in a collection.

More specifically, when calling this on the client, we’re telling minimongo (Meteor’s client-side implementation of the MongoDB API) to give us all of the documents published to the client for this collection. This means that if our publication on the server only returned, say, 5 documents, the list in that image would only be five documents long.

As you might imagine, this isn’t great for performance. Before we look at how to prevent getting back a waterfall of documents, though, let’s see how a findOne() behaves so it’s crystal clear.
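To make the cursor idea concrete before moving on, here is a toy cursor in plain JavaScript. It is not how Meteor implements cursors; SketchCursor and the sample documents are made up, and only the method names (fetch, count, forEach) mirror ones that exist on Meteor cursors. The point is that a cursor remembers how to find documents and only reads them when asked.

```javascript
// A toy cursor (not Meteor's implementation): it stores *how* to find
// documents and only reads them when asked.
class SketchCursor {
  constructor(docs, predicate) {
    this.docs = docs;           // reference to the collection's documents
    this.predicate = predicate; // the "query", as a plain function here
  }
  fetch() {
    return this.docs.filter(this.predicate);
  }
  count() {
    return this.fetch().length;
  }
  forEach(callback) {
    this.fetch().forEach(callback);
  }
}

// Hypothetical documents published to the client.
const published = [
  { _id: "a1", title: "Document #1" },
  { _id: "b2", title: "Document #2" }
];

// Like Documents.find() with no query: point at everything.
const cursor = new SketchCursor(published, () => true);
console.log(cursor.count()); // 2

// The cursor is a pointer, not a copy: new documents show up on the next read.
published.push({ _id: "c3", title: "Document #3" });
console.log(cursor.fetch().map((doc) => doc.title));
```

Because the cursor with an empty query points at everything, its size grows with the published set, which is why an unfiltered find() gets expensive.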

The findOne() method

So, piggybacking on our example above, what does it look like if we just call Documents.findOne()? Well, exactly what you’d expect:

BROWSER CONSOLE

// Input: Documents.findOne()
Object {_id: "XZAxvr66A9z49LDjQ", title: "Document #1"}

Neat. So…we get back a single object! In this case, we get back the first document in our collection. Depending on how you write your publications this may be useful, but in a larger set of documents we need to get more specific. To do this, let’s dig into queries and see how those work.

The MongoDB Query Document

Recall that earlier we learned that the first argument passed to our collection method is known as a query. More specifically, MongoDB acknowledges this as a query document. The basic idea behind this query document is to tell MongoDB what documents we want to retrieve. Let’s look at some code examples so this is a little more concrete.

Find all documents with the title “Tasty Pies in July”

Don’t ask about the title of our post. Just go with it. If we wanted to find all of the documents in a collection with the title "Tasty Pies in July", our MongoDB call would look like this:

Documents.find( { "title": "Tasty Pies in July" } );

In this case, we’d hope that we only have one post with this name, so we should expect to only get back one document. Notice that we use the find() method here, but if we’re only expecting one document, it’s more efficient to just say Documents.findOne( { "title": "Tasty Pies in July" } );. Remember that the difference here is that with a find() we get back an entire cursor, but with a findOne() we get back a single document.
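Here is a small sketch of that difference outside of Meteor. The helper names (matches, findAll, findFirst) and the _id values are made up for illustration, and the matching only handles plain equality on top-level fields.

```javascript
// Hypothetical sample data; the _id values are made up for illustration.
const documents = [
  { _id: "pie-1", title: "Tasty Pies in July" },
  { _id: "taco-2", title: "Taco Party!" }
];

// Equality-only matching: every field in the query must equal the document's.
const matches = (doc, query) =>
  Object.keys(query).every((field) => doc[field] === query[field]);

// find-like: always a list (a cursor in Meteor), even for a single match.
const findAll = (docs, query = {}) => docs.filter((doc) => matches(doc, query));

// findOne-like: the first matching document, or undefined when none match.
const findFirst = (docs, query = {}) => docs.find((doc) => matches(doc, query));

console.log(findAll(documents, { title: "Tasty Pies in July" }).length); // 1
console.log(findFirst(documents, { title: "Tasty Pies in July" })._id);  // pie-1
console.log(findFirst(documents, { title: "No Such Post" }));            // undefined
```

With the find-like version we still have to pull the single document out of the list, while the findOne-like version hands it to us directly.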

Find all documents using multiple fields

What’s neat about MongoDB queries is that we can pass multiple fields. Every field we add narrows the match further, so a query can be as specific as we need, which makes it really nice for looking up data in our collections. Let’s say we had the following collection of documents:

{ "owner": "123456", "category": "photos", "date.year": 2014 }
{ "owner": "123456", "category": "recipes", "date.year": 2015 }
{ "owner": "123456", "category": "recipes", "date.year": 2014 }
{ "owner": "123456", "category": "writing", "date.year": 2014 }
{ "owner": "123456", "category": "recipes", "date.year": 2015 }

And the following query:

Documents.find( { "owner": "123456", "category": "recipes", "date.year": 2015 } );

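Woah! In this example, we can see our query is taking on three fields. Here, we’re saying “give me all of the documents where the owner field (some user) is equal to the ID 123456, the document is in the category ‘recipes’, and the publication year date.year is 2015.” The date.year key is dot notation: it points at the year field nested inside a date sub-document, and the sample documents here appear to use the same dotted key as shorthand.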

Based on our query above, we should only get two of these documents returned to us. Note: all of these documents are owned by the same user, but because their categories and years published vary, we’ll only get back the ones our query has specified.

// Result of our query:
{ "owner": "123456", "category": "recipes", "date.year": 2015 }
{ "owner": "123456", "category": "recipes", "date.year": 2015 }
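To see why exactly two documents come back, here is a minimal sketch of how a multi-field query behaves: every field has to match (an implicit AND), and a dotted key like date.year is resolved against a nested field. The helpers getPath and matchesAll are made up for this example, and it assumes the documents store the year as a nested date.year field.

```javascript
// Resolve a dotted path such as "date.year" against a nested document.
const getPath = (doc, path) =>
  path.split(".").reduce((value, key) => (value == null ? undefined : value[key]), doc);

// Every field in the query must match: multiple fields are an implicit AND.
const matchesAll = (doc, query) =>
  Object.entries(query).every(([path, expected]) => getPath(doc, path) === expected);

// The same five documents, with date.year written out as a nested field.
const documents = [
  { owner: "123456", category: "photos",  date: { year: 2014 } },
  { owner: "123456", category: "recipes", date: { year: 2015 } },
  { owner: "123456", category: "recipes", date: { year: 2014 } },
  { owner: "123456", category: "writing", date: { year: 2014 } },
  { owner: "123456", category: "recipes", date: { year: 2015 } }
];

const query = { owner: "123456", category: "recipes", "date.year": 2015 };
const result = documents.filter((doc) => matchesAll(doc, query));
console.log(result.length); // 2
```

Dropping any one field from the query loosens the match: without "date.year", for instance, all three recipes documents would come back.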

Advanced querying

This is all cool, but we can go even further. Let’s set up some example documents and then write an advanced query to pick only the documents that we want.

{ "owner": "123456", "title": "New Year Photos", "category": "photos", "date.year": 2014, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "Taco Party!", "category": "recipes", "date.year": 2015, "tags": [ 'tacos', 'fiesta', 'salsa' ] }
{ "owner": "123456", "title": "We're Having a Baby!", "category": "announcements", "date.year": 2013, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "New Winter Gear", "category": "announcements", "date.year": 2013, "tags": [ 'hats', 'coats', 'gloves' ] }
{ "owner": "123456", "title": "Baby's First Birthday", "category": "photos", "date.year": 2016, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "Rusty the Puppy", "category": "announcements", "date.year": 2016, "tags": [ 'baby', 'wedding', 'dog' ] }

We only want to get two documents out of this set: the ones where the owner is “123456”, the category is “photos” or “announcements”, the year published is before 2015, and the tags include “baby” or “wedding”. Holy cow. This is a lot of stuff, but there’s some good news: we can do this with a single query.

Documents.find({ 
  "owner": "123456",
  "category": { $in: [ "photos", "announcements" ] },
  "date.year": { $lt: 2015 },
  "tags": { $in: [ 'baby', 'wedding' ] }
});

If we’ve done our job well, we should get back this result set:

{ "owner": "123456", "title": "New Year Photos", "category": "photos", "date.year": 2014, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "We're Having a Baby!", "category": "announcements", "date.year": 2013, "tags": [ 'baby', 'wedding', 'dog' ] }

First, we get all of the documents where the owner field matches “123456”. Easy. Next, we use the MongoDB $in operator to say, “give us all of the documents where the value of category is equal to one of the values in this array we’re passing.” Next, we narrow our query down by saying give us all of the documents where the value of date.year is less than 2015, using the MongoDB $lt operator. Finally, we use the $in operator again to check that our tags field has baby or wedding in it (yes, $in also works when the field itself holds an array: the document matches if at least one of the field’s values is in the array we’ve passed).

Wow, wow, wow! Isn’t that cool?! With a single query, we managed to get hyper-specific with what we wanted in a collection. Even more exciting is that this is just the tip of the iceberg. We have access to a lot of different operators in MongoDB that can help us whittle down our search results.
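If it helps to see the mechanics, here is a rough sketch of how $in and $lt could be evaluated in plain JavaScript. It is an illustration only, not MongoDB's or minimongo's code; the helper names are made up, only these two operators are handled, and the documents are assumed to store the year as a nested date.year field.

```javascript
// Resolve a dotted path such as "date.year" against a nested document.
const getPath = (doc, path) =>
  path.split(".").reduce((value, key) => (value == null ? undefined : value[key]), doc);

// Only the two operators used above; real MongoDB has many more.
const operators = {
  // Scalar field: the value must be in the list. Array field: any element may be.
  $in: (value, list) =>
    Array.isArray(value) ? value.some((item) => list.includes(item)) : list.includes(value),
  $lt: (value, limit) => value < limit
};

const matchesCondition = (value, condition) => {
  const isOperatorObject = condition !== null && typeof condition === "object";
  if (!isOperatorObject) return value === condition; // plain equality
  return Object.entries(condition).every(([name, argument]) => {
    if (!(name in operators)) throw new Error(`Unsupported operator: ${name}`);
    return operators[name](value, argument);
  });
};

// Every field in the query must pass: the conditions are combined with AND.
const matches = (doc, query) =>
  Object.entries(query).every(([path, condition]) =>
    matchesCondition(getPath(doc, path), condition));

const documents = [
  { title: "New Year Photos", owner: "123456", category: "photos", date: { year: 2014 }, tags: ["baby", "wedding", "dog"] },
  { title: "Taco Party!", owner: "123456", category: "recipes", date: { year: 2015 }, tags: ["tacos", "fiesta", "salsa"] },
  { title: "We're Having a Baby!", owner: "123456", category: "announcements", date: { year: 2013 }, tags: ["baby", "wedding", "dog"] },
  { title: "New Winter Gear", owner: "123456", category: "announcements", date: { year: 2013 }, tags: ["hats", "coats", "gloves"] },
  { title: "Baby's First Birthday", owner: "123456", category: "photos", date: { year: 2016 }, tags: ["baby", "wedding", "dog"] },
  { title: "Rusty the Puppy", owner: "123456", category: "announcements", date: { year: 2016 }, tags: ["baby", "wedding", "dog"] }
];

const query = {
  owner: "123456",
  category: { $in: ["photos", "announcements"] },
  "date.year": { $lt: 2015 },
  tags: { $in: ["baby", "wedding"] }
};

const titles = documents.filter((doc) => matches(doc, query)).map((doc) => doc.title);
console.log(titles); // the two titles from the result set above
```

Notice that “New Winter Gear” passes the category and year checks but fails on tags, which is what keeps the result down to two documents.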

Okay. That was pretty wild, but now that we know how to select documents, let’s see how we can further trim down the results we get using projections.

The MongoDB Projection Document

In the second argument passed to our collection method we can define something known as a projection document. That name is a little squirrelly, but in essence this document allows us to say “given the results that you found with our query, let’s take that result set and filter it further.” Cool! Again, code is easiest here. Let’s piggyback on the last example we used while looking at how to define advanced MongoDB queries.

Again, we have this massive set of documents and only want to get two back.

// Our entire document set.
{ "owner": "123456", "title": "New Year Photos", "category": "photos", "date.year": 2014, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "Taco Party!", "category": "recipes", "date.year": 2015, "tags": [ 'tacos', 'fiesta', 'salsa' ] }
{ "owner": "123456", "title": "We're Having a Baby!", "category": "announcements", "date.year": 2013, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "New Winter Gear", "category": "announcements", "date.year": 2013, "tags": [ 'hats', 'coats', 'gloves' ] }
{ "owner": "123456", "title": "Baby's First Birthday", "category": "photos", "date.year": 2016, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "Rusty the Puppy", "category": "announcements", "date.year": 2016, "tags": [ 'baby', 'wedding', 'dog' ] }

// The result we got using our query.
{ "owner": "123456", "title": "New Year Photos", "category": "photos", "date.year": 2014, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "We're Having a Baby!", "category": "announcements", "date.year": 2013, "tags": [ 'baby', 'wedding', 'dog' ] }

Let’s modify the query we used to return those last two documents a bit. Right now, we’re getting back all of the fields in that document. What if we only wanted to get the title of the documents that match this query?

Documents.find({ 
  "owner": "123456",
  "category": { $in: [ "photos", "announcements" ] },
  "date.year": { $lt: 2015 },
  "tags": { $in: [ 'baby', 'wedding' ] }
}, {
  fields: {
    "title": 1
  }
});

For comparison, here’s the result of our query before we add this projection, and then after:

// Before we added our projection.
{ "owner": "123456", "title": "New Year Photos", "category": "photos", "date.year": 2014, "tags": [ 'baby', 'wedding', 'dog' ] }
{ "owner": "123456", "title": "We're Having a Baby!", "category": "announcements", "date.year": 2013, "tags": [ 'baby', 'wedding', 'dog' ] }

// After we add our projection.
{ "_id": 123, "title": "New Year Photos" }
{ "_id": 456, "title": "We're Having a Baby!" }

Woah. The utility in this may not be entirely clear. Say we had a part of our interface where a user could toggle some filters. Those filters allowed us to produce a query like the one above (looking for category, year published, etc). When we go to output the list of posts that match that query, as a developer, we know that in order to display a list of posts back to our user, we only need the _id of that document (to link to) and its title.

But wait…where did that _id come from?

THE _ID FIELD IS ALWAYS RETURNED UNLESS EXCLUDED

By design, unless we explicitly pass { fields: { "_id": 0 } } as the second argument to our call, we will always get back an _id field.
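Here is a small sketch of how a fields specifier shapes a single document, including the special handling of _id. The project helper is made up for illustration, only handles top-level fields, and assumes the specifier does not mix 1s and 0s (other than for _id).

```javascript
// Apply a Meteor-style `fields` specifier to one document (top-level only).
const project = (doc, fields = {}) => {
  const names = Object.keys(fields).filter((name) => name !== "_id");
  const including = names.some((name) => fields[name]); // any 1 means "include" mode
  const result = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") {
      // _id comes along unless it is explicitly switched off.
      if (fields._id !== 0 && fields._id !== false) result._id = value;
      continue;
    }
    const listed = names.includes(key);
    if (including ? listed : !listed) result[key] = value;
  }
  return result;
};

const doc = {
  _id: 123,
  owner: "123456",
  title: "New Year Photos",
  category: "photos",
  tags: ["baby", "wedding", "dog"]
};

console.log(project(doc, { title: 1 }));         // _id and title
console.log(project(doc, { title: 1, _id: 0 })); // title only
console.log(project(doc, { tags: 0 }));          // everything except tags
```

The first call matches what we saw above: we only asked for title, and _id came along for free.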

We can take this example even further, just like our queries, and pass more than one projection operation (here, fields plus sort) to get as specific as possible:

Documents.find({ 
  "owner": "123456",
  "category": { $in: [ "photos", "announcements" ] },
  "date.year": { $lt: 2015 },
  "tags": { $in: [ 'baby', 'wedding' ] }
}, {
  fields: {
    "title": 1
  },
  sort: {
    "title": -1
  }
});

Now, we get our result back sorted by title in descending (reverse alphabetical) order:

{ "_id": 456, "title": "We're Having a Baby!" }
{ "_id": 123, "title": "New Year Photos" }

Pretty crazy. But wait…what is this 1 and -1 stuff about? Admittedly, this is pretty confusing at first.

In our fields projection, 1 is equal to true. In Mongo-speak, this means that it’s “true” that we should only send back the title field. Let that soak in. If we were to change this to a 0, or false, we’d be saying “send back everything in this document but the title field.”

The latter sort setting title to a -1 is pure witchcraft on the part of MongoDB. As explained in the documentation for sort:

Specify in the sort parameter the field or fields to sort by and a value of 1 or -1 to specify an ascending or descending sort respectively.

— via MongoDB Sort Documentation

So, if we set a field to 1, that means that we’ll get back our result set sorted by that field in ascending order (1 2 3 4 5). If we change that to -1, we get our result set sorted by that field in descending order (5 4 3 2 1).
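To take some of the witchcraft out of it, here is a sketch of how a sort specifier can be turned into an ordinary JavaScript comparator. The makeComparator helper is made up for illustration, and it compares strings the way JavaScript does, which may not match MongoDB's ordering in every case.

```javascript
// Build a comparator from a sort specifier such as { title: -1 }.
// 1 keeps the natural (ascending) order; -1 flips it (descending).
const makeComparator = (sortSpec) => (a, b) => {
  for (const [field, direction] of Object.entries(sortSpec)) {
    if (a[field] < b[field]) return -1 * direction;
    if (a[field] > b[field]) return 1 * direction;
  }
  return 0; // equal on every sort field
};

const results = [
  { _id: 123, title: "New Year Photos" },
  { _id: 456, title: "We're Having a Baby!" }
];

// Copy before sorting so the original order is left alone.
const ascending = [...results].sort(makeComparator({ title: 1 }));
const descending = [...results].sort(makeComparator({ title: -1 }));

console.log(ascending.map((doc) => doc.title));  // "New Year Photos" comes first
console.log(descending.map((doc) => doc.title)); // "We're Having a Baby!" comes first
```

Multiplying by the direction is the whole trick: with -1, every “comes before” answer becomes “comes after.”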

Again, just like our queries, there are a lot of different types of projections that we can pass in Meteor in order to filter our results.

Using Queries and Projections

While we’ve explained how to use queries and projections in a general sense in this snippet, we haven’t really talked about where they’re best applied. All of the code we’ve seen here can be applied on the client and the server.

In reality, though, queries and projections are best suited for use on the server, and more specifically, within your publications. The reason for that has to do with the nature of publications and how they control what data gets to the client from the server. Think of your publication as the strainer for your data. It acts as the middleman between the client and the server in respect to your data, limiting what actually gets sent to the client.

It’s best to apply the techniques we’ve outlined above before your data hits the client. Why? Because this will ensure that we’re only sending the client what it really needs at any given point in time. We could perform these same operations on the client; however, we’d be doing so on a much larger data set that we had to burden our client (read: user) with first.
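As a rough illustration of that burden, the sketch below compares what the client would receive with and without filtering on the server. Everything here is a stand-in: publish is a made-up function rather than Meteor.publish, the documents are invented, and the length of the JSON encoding is only a crude proxy for what actually travels over the wire.

```javascript
// Hypothetical server-side data set.
const serverDocs = [
  { _id: 1, owner: "123456", title: "New Year Photos", category: "photos", tags: ["baby", "wedding", "dog"] },
  { _id: 2, owner: "123456", title: "Taco Party!", category: "recipes", tags: ["tacos", "fiesta", "salsa"] },
  { _id: 3, owner: "654321", title: "Someone Else's Post", category: "photos", tags: ["cats"] }
];

// Stand-in for a publication: filter and trim on the "server" before sending.
const publish = (docs, predicate, fieldNames) =>
  docs.filter(predicate).map((doc) => {
    const trimmed = { _id: doc._id };
    for (const name of fieldNames) trimmed[name] = doc[name];
    return trimmed;
  });

// Approximate the payload by the length of its JSON encoding.
const payloadSize = (docs) => JSON.stringify(docs).length;

const unfiltered = serverDocs;
const filtered = publish(serverDocs, (doc) => doc.owner === "123456", ["title"]);

console.log(payloadSize(unfiltered), "vs", payloadSize(filtered));
```

In a Meteor app, the same idea would be expressed by returning something like Documents.find(query, { fields: { title: 1 } }) from inside a Meteor.publish function, so the trimming happens before anything is sent.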

Takeaways

  • When pulling data into your app, think carefully about what queries and projections you can use to slim down your result sets to only what you need.
  • Using queries and projections wisely on the server can save your clients a lot of unnecessary overhead.
  • Remember to consider whether a find() (entire cursor) or findOne() (single document) is necessary when querying data.
