Index Creation

MongoDB provides several options that only affect the creation of the index. Specify these options in a document as the second argument to the db.collection.ensureIndex() method. This section describes the uses of these creation options and their behavior. Some options that you can pass to ensureIndex() control the properties of the index rather than its creation. For example, the unique option affects the behavior of the index after creation.

Background Construction – By default, creating an index blocks all other operations on a database. When building an index on a collection, the database that holds the collection is unavailable for read or write operations until the index build completes. Any operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index build to complete.

For potentially long-running index builds, consider the background option so that the MongoDB database remains available during the build. For example, to create an index on the zipcode field of the people collection in the background, issue the following:

db.people.ensureIndex( { zipcode: 1}, {background: true} )

By default, background is false for building MongoDB indexes. You can combine the background option with other options, as in the following:

db.people.ensureIndex( { zipcode: 1}, {background: true, sparse: true } )

Behavior – As of MongoDB version 2.4, a mongod instance can build more than one index in the background concurrently. Before 2.4, a mongod instance could only build one background index per database at a time. Before 2.2, a single mongod instance could only build one index at a time.

Background indexing operations run in the background so that other database operations can run while creating the index. However, the mongo shell session or connection where you are creating the index will block until the index build is complete. To continue issuing commands to the database, open another connection or mongo instance.

Queries will not use partially-built indexes: the index will only be usable once the index build is complete. If MongoDB is building an index in the background, you cannot perform other administrative operations involving that collection, including running repairDatabase, dropping the collection (i.e. db.collection.drop()), and running compact. These operations will return an error during background index builds.

Performance – The background index operation uses an incremental approach that is slower than the normal “foreground” index build. If the index is larger than the available RAM, the incremental process can take much longer than the foreground build. If your application includes ensureIndex() operations and the index does not already exist, for whatever operational reason, building the index can have a severe impact on the performance of the database. To avoid performance issues, make sure that your application checks for the indexes at start up using the getIndexes() method, or the equivalent method for your driver, and terminates if the proper indexes do not exist. Always build indexes in production instances using separate application code, during designated maintenance windows.
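The start-up check described above can be sketched in plain JavaScript. This is only an illustration, not MongoDB or driver code: it assumes the index list has the shape returned by getIndexes() (an array of documents, each with a key field), and the helper name missingIndexes and the sample data are hypothetical.

```javascript
// Hypothetical helper: returns the required key specifications that have no
// matching entry in a getIndexes()-style list.
function missingIndexes(existing, required) {
  // Compare key documents by their serialized form, so field order matters,
  // as it does for compound indexes.
  var have = existing.map(function (ix) { return JSON.stringify(ix.key); });
  return required.filter(function (spec) {
    return have.indexOf(JSON.stringify(spec)) === -1;
  });
}

// Sample data shaped like getIndexes() output (assumed for illustration).
var existingIndexes = [
  { v: 1, key: { _id: 1 }, ns: "test.people", name: "_id_" },
  { v: 1, key: { zipcode: 1 }, ns: "test.people", name: "zipcode_1" }
];
var requiredIndexes = [ { zipcode: 1 }, { "phone-number": 1 } ];

var missing = missingIndexes(existingIndexes, requiredIndexes);
if (missing.length > 0) {
  // An application would report this and terminate rather than build the index itself.
  console.log("Missing indexes: " + JSON.stringify(missing));
}
```

In a real application, the first argument would come from db.people.getIndexes() or the driver's equivalent, and the final branch would terminate the process instead of only logging.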

Building Indexes on Secondaries – Changed in version 2.6: Secondary members can now build indexes in the background. Previously all index builds on secondaries were in the foreground. Background index operations on replica set secondaries begin after the primary completes building the index. If MongoDB builds an index in the background on the primary, the secondaries will then build that index in the background.

To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step down the primary, restart it as a standalone, and build the index on the former primary. The amount of time required to build the index on a secondary must be within the window of the oplog, so that the secondary can catch up with the primary. Indexes on secondary members in “recovering” mode are always built in the foreground to allow them to catch up as soon as possible.
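The oplog constraint above is simple arithmetic, and a rough check can be sketched as follows. The sketch assumes you have already measured the timestamps of the oldest and newest oplog entries and have your own estimate of the build time; the function name, the safety factor, and the sample numbers are all hypothetical.

```javascript
// Hypothetical helper: does an estimated index build fit inside the oplog window?
// All times are in seconds. safetyFactor pads the estimate (2 means "allow double").
function fitsOplogWindow(oldestEntrySecs, newestEntrySecs, estimatedBuildSecs, safetyFactor) {
  var windowSecs = newestEntrySecs - oldestEntrySecs;
  return estimatedBuildSecs * safetyFactor <= windowSecs;
}

// Assumed sample values: an oplog that spans 24 hours.
var oldestEntry = 1400000000;
var newestEntry = oldestEntry + 24 * 3600;

console.log(fitsOplogWindow(oldestEntry, newestEntry, 6 * 3600, 2));  // true: 12 h needed, 24 h available
console.log(fitsOplogWindow(oldestEntry, newestEntry, 18 * 3600, 2)); // false: 36 h needed, 24 h available
```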

Drop Duplicates – MongoDB cannot create a unique index on a field that has duplicate values. To force the creation of a unique index, you can specify the dropDups option, which will only index the first occurrence of a value for the key, and delete all subsequent documents that have the same value. As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the index with a “null” value.

If subsequent documents do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove these documents from the collection when creating the index. If you combine dropDups with the sparse option, the index will only include documents that have the field, and the documents without the field will remain in the database. To create a unique index that drops duplicates on the username field of the accounts collection, use a command in the following form:

db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } )

Specifying { dropDups: true } will delete data from your database. Use with extreme caution. By default, dropDups is false.
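Because dropDups deletes data, it can help to reason about its effect before running it. The following is a minimal in-memory model of the rules described above, written in plain JavaScript. It is not MongoDB code: the function name dropDupsPreview and the sample documents are hypothetical, and it assumes the index build sees documents in array order.

```javascript
// Hypothetical dry run: models which documents a unique index with dropDups
// would keep and which it would delete, following the rules in the text.
function dropDupsPreview(docs, field, sparse) {
  var seen = Object.create(null);
  var kept = [];
  var removed = [];
  docs.forEach(function (doc) {
    var hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) {
      // A sparse index skips this document, so it stays in the collection.
      kept.push(doc);
      return;
    }
    // A document without the field is indexed with a null value.
    var key = JSON.stringify(hasField ? doc[field] : null);
    if (seen[key]) {
      removed.push(doc);   // a later occurrence of an already indexed value
    } else {
      seen[key] = true;    // the first occurrence is indexed and kept
      kept.push(doc);
    }
  });
  return { kept: kept, removed: removed };
}

// Assumed sample documents.
var accounts = [
  { _id: 1, username: "ann" },
  { _id: 2, username: "bob" },
  { _id: 3, username: "ann" },
  { _id: 4 },
  { _id: 5 }
];

var ids = function (list) { return list.map(function (d) { return d._id; }); };
console.log(ids(dropDupsPreview(accounts, "username", false).removed)); // [ 3, 5 ]
console.log(ids(dropDupsPreview(accounts, "username", true).removed));  // [ 3 ]
```

Without sparse, the second document that lacks username is treated as a duplicate null and removed. With sparse, both such documents stay in the collection.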

Index Names – The default name for an index is the concatenation of the indexed keys and each key’s direction in the index, 1 or -1. As an example, issue the following command to create an index on item and quantity:

db.products.ensureIndex( { item: 1, quantity: -1 } )

The resulting index is named item_1_quantity_-1. Optionally, you can specify a name for an index instead of using the default name. As an example, issue the following command to create an index on item and quantity and specify inventory as the index name:

db.products.ensureIndex( { item: 1, quantity: -1 }, { name: "inventory" } )

The resulting index has the name inventory. To view the name of an index, use the getIndexes() method.
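The default naming rule described above can be reproduced with a few lines of plain JavaScript. This is a sketch of the rule as stated in the text, not MongoDB's own implementation, and defaultIndexName is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the default index name by joining each key and
// its direction with underscores, in the order the keys appear.
// Note: JavaScript reorders integer-like property names (such as "0"), so this
// sketch assumes ordinary field names.
function defaultIndexName(keys) {
  return Object.keys(keys).map(function (field) {
    return field + "_" + keys[field];
  }).join("_");
}

console.log(defaultIndexName({ item: 1, quantity: -1 })); // item_1_quantity_-1
```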

Create an Index – Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the documents in a collection. MongoDB creates an index on the _id field of every collection by default, but allows users to create indexes for any collection on any field in a document.

This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes, which are indexes on multiple fields.

Create an Index on a Single Field – To create an index, use ensureIndex() or a similar method from your driver. For example, the following creates an index on the phone-number field of the people collection:

db.people.ensureIndex( { "phone-number": 1 } )

ensureIndex() only creates an index if an index of the same specification does not already exist. An index supports and optimizes queries that select on the indexed field. For queries that cannot use an index, MongoDB must scan all documents in a collection for documents that match the query. The value of the field in the index specification describes the kind of index for that field. For example, a value of 1 specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending order. As an example, if you create an index on the user_id field in the records collection, the index will support the following query:

db.records.find( { user_id: 2 } )

However, the following query on the profile_url field is not supported by this index:

db.records.find( { profile_url: 2 } )

If your collection holds a large amount of data, and your application needs to be able to access the data while building the index, consider building the index in the background, as described in Background Construction. Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any effect on the resulting index.

Create a Compound Index – Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound indexes that include content from multiple fields. Continue reading for instructions and examples of building a compound index.

Build a Compound Index – To create a compound index, use an operation that resembles the following prototype:

db.collection.ensureIndex( { a: 1, b: 1, c: 1 } )

As an example, the following operation will create an index on the item, category, and price fields of the products collection:

db.products.ensureIndex( { item: 1, category: 1, price: 1 } )

If your collection holds a large amount of data, and your application needs to be able to access the data while building the index, consider building the index in the background, as described in Background Construction. Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any effect on the resulting index. The value of the field in the index specification describes the kind of index for that field. For example, a value of 1 specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending order.

Create a Unique Index – MongoDB allows you to specify a unique constraint on an index. These constraints prevent applications from inserting documents that have duplicate values for the indexed fields. Additionally, if you want to create an index on a collection that has existing data that might have duplicate values for the indexed field, you may choose to combine unique enforcement with duplicate dropping.

Unique Indexes – To create a unique index, consider the following prototype:

db.collection.ensureIndex( { a: 1 }, { unique: true } )

For example, you may want to create a unique index on the tax-id field of the accounts collection to prevent storing multiple account records for the same legal entity:

db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } )

The _id index is a unique index. In some situations you may consider using the _id field itself for this kind of data rather than using a unique index on another field. In many situations you will want to combine the unique constraint with the sparse option. When MongoDB indexes a field, if a document does not have a value for the field, the index entry for that document will be null. Since unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the second document and all subsequent documents without the indexed field. Consider the following prototype:

db.collection.ensureIndex( { a: 1 }, { unique: true, sparse: true } )

You can also enforce a unique constraint on compound indexes, as in the following prototype:

db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } )

These indexes enforce uniqueness for the combination of index keys and not for either key individually.
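To make the "combination of keys" rule concrete, here is a small in-memory model in plain JavaScript. It is an illustration of the rule only, not MongoDB code; the function name violatesCompoundUnique and the sample documents are hypothetical, and missing fields are treated as null, as described earlier for unique indexes.

```javascript
// Hypothetical model of a unique compound index: returns true if any two
// documents share the same combination of values for the given fields.
function violatesCompoundUnique(docs, fields) {
  var seen = Object.create(null);
  return docs.some(function (doc) {
    // Build one key from all indexed fields; a missing field counts as null.
    var key = JSON.stringify(fields.map(function (f) {
      return Object.prototype.hasOwnProperty.call(doc, f) ? doc[f] : null;
    }));
    if (seen[key]) {
      return true;
    }
    seen[key] = true;
    return false;
  });
}

// Values of a and b repeat individually, but no (a, b) pair repeats.
var distinctPairs = [ { a: 1, b: 1 }, { a: 1, b: 2 }, { a: 2, b: 1 } ];
console.log(violatesCompoundUnique(distinctPairs, ["a", "b"])); // false

// Adding a second { a: 1, b: 1 } repeats a full combination.
var repeatedPair = distinctPairs.concat([ { a: 1, b: 1 } ]);
console.log(violatesCompoundUnique(repeatedPair, ["a", "b"])); // true
```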

Drop Duplicates – To force the creation of a unique index on a collection with duplicate values in the field you are indexing, you can use the dropDups option. This will force MongoDB to create a unique index by deleting documents with duplicate values when building the index. Consider the following prototype invocation of ensureIndex():

db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } )

Specifying { dropDups: true } may delete data from your database. Use with extreme caution.

Create a Sparse Index – Sparse indexes are like non-sparse indexes, except that they omit references to documents that do not include the indexed field. For fields that are only present in some documents, sparse indexes may provide significant space savings.

Prototype – To create a sparse index on a field, use an operation that resembles the following prototype:

db.collection.ensureIndex( { a: 1 }, { sparse: true } )

As an example, the following operation creates a sparse index on the users collection that only includes a document in the index if the twitter_name field exists in that document:

db.users.ensureIndex( { twitter_name: 1 }, { sparse: true } )

The index excludes all documents that do not include the twitter_name field. Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not included in the index.

Create a Hashed Index – New in version 2.4. Hashed indexes compute a hash of the value of a field in a collection and index the hashed value. These indexes permit equality queries and may be suitable shard keys for some collections. MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need to compute hashes.

Procedure – To create a hashed index, specify hashed as the value of the index key, as in the following example, which specifies a hashed index on _id:

db.collection.ensureIndex( { _id: "hashed" } )

MongoDB supports hashed indexes of any single field. The hashing function collapses sub-documents and computes the hash for the entire value, but does not support multi-key (i.e. arrays) indexes. You may not create compound indexes that have hashed index fields.

Build Indexes on Replica Sets – In versions before 2.6, background index creation operations become foreground indexing operations on secondary members of replica sets; as noted in Building Indexes on Secondaries, from version 2.6 secondaries can build indexes in the background. A foreground index build blocks all replication and read operations on a secondary while it builds the index. Secondaries will begin building indexes after the primary finishes building the index. In sharded clusters, the mongos will send ensureIndex() to the primary member of the replica set for each shard, which then replicates the operation to the secondaries after the primary finishes building the index.

Considerations

  • Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without falling too far behind to catch up.
  • This procedure takes one member out of the replica set at a time. However, it only affects one member of the set at a time rather than all secondaries at the same time.
  • Do not use this procedure when building a unique index with the dropDups option.

Procedure – If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that provides each shard.

Stop One Secondary – Stop the mongod process on one secondary. Restart the mongod process without the --replSet option and on a different port. This instance is now in “standalone” mode. For example, if your mongod normally runs on the default port of 27017 with the --replSet option, you would use the following invocation:

mongod --port 47017

Build the Index – Create the new index using ensureIndex() in the mongo shell, or a comparable method in your driver. This operation will create or rebuild the index on this mongod instance. For example, to create an ascending index on the username field of the records collection, use the following mongo shell operation:

db.records.ensureIndex( { username: 1 } )

Restart the Program mongod – When the index build completes, start the mongod instance with the --replSet option on its usual port:

mongod --port 27017 --replSet rs0

Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed. Allow replication to catch up on this member.

Build Indexes on all Secondaries – For each secondary in the set, build an index according to the following steps:

  • Stop One Secondary
  • Build the Index
  • Restart the Program mongod

Build the Index on the Primary – To build an index on the primary you can either:

  • Build the index in the background on the primary.
  • Step down the primary using the rs.stepDown() method in the mongo shell to cause the current primary to gracefully become a secondary and allow the set to elect another member as primary. Then repeat the index building procedure, listed above, to build the index on the former primary:
  • Stop One Secondary
  • Build the Index
  • Restart the Program mongod

Building the index in the background takes longer than the foreground index build and results in a less compact index structure. Additionally, the background index build may impact write performance on the primary. However, building the index in the background allows the set to remain continuously available for write operations while MongoDB builds the index.

Build Indexes in the Background – By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur during a foreground index build. Background index construction allows read and write operations to continue while building the index.

Considerations – Background index builds take longer to complete and result in an index that is initially larger, or less compact, than an index built in the foreground. Over time, the compactness of indexes built in the background will approach foreground-built indexes. After MongoDB finishes building the index, background-built indexes are functionally identical to any other index.

Procedure – To create an index in the background, add the background option to the ensureIndex() operation, as in the following example:

db.collection.ensureIndex( { a: 1 }, { background: true } )

Build Old Style Indexes – Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier than 2.0. MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1} format and the earlier {v:0} format. MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a version prior to 2.0, you must drop and re-create your indexes.

To build pre-2.0 indexes, use the dropIndexes() and ensureIndex() methods. You cannot simply reindex the collection. When you reindex on versions that only support {v:0} indexes, the v fields in the index definition still hold values of 1, even though the indexes would now use the {v:0} format. If you were to upgrade again to version 2.0 or later, these indexes would not work. As an example, suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the items collection:

{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" }

The v field tells you the index is a {v:1} index, which is incompatible with version 1.8. To drop the index, issue the following command

db.items.dropIndex( { name : 1 } )

To recreate the index as a {v:0} index, issue the following command

db.items.ensureIndex( { name : 1 } , { v : 0 } )
