
Sharded Cluster Deployment

The following sections provide information on deploying sharded clusters.

Deploy a Sharded Cluster – Use the following sequence of tasks to deploy a sharded cluster:

Start the Config Server Database Instances – The config server processes are mongod instances that store the cluster’s metadata. You designate a mongod as a config server using the --configsvr option. Each config server stores a complete copy of the cluster’s metadata. In production deployments, you must deploy exactly three config server instances, each running on a different server to ensure good uptime and data safety. In test environments, you can run all three instances on a single server. Start each config server with a command of the following form:

mongod --configsvr --dbpath <path> --port <port>

The default port for config servers is 27019. You can specify a different port. The following example starts a config server using the default port and default data directory:

mongod --configsvr --dbpath /data/configdb --port 27019

All config servers must be running and available when you first initiate a sharded cluster.
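For a test environment where all three config servers share one machine, each instance still needs its own data directory and port. The following is a minimal sketch in plain JavaScript (runnable with Node.js) that prints one command line per instance. The helper name configServerCommands, the /data base path, and the consecutive port numbering are assumptions made for illustration, not something the tutorial prescribes.

```javascript
// Hypothetical helper: builds one mongod command line per config server
// for a single-host test setup. Paths and ports are illustrative assumptions.
function configServerCommands(basePath, firstPort, count) {
  const commands = [];
  for (let i = 0; i < count; i++) {
    // Each instance gets a distinct data directory and a distinct port.
    const dbpath = basePath + "/configdb" + i;
    const port = firstPort + i;
    commands.push("mongod --configsvr --dbpath " + dbpath + " --port " + port);
  }
  return commands;
}

// Print the three commands for a test machine.
configServerCommands("/data", 27019, 3).forEach(function (command) {
  console.log(command);
});
```

The data directories must exist before you start the instances.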

Start the mongos Instances – The mongos instances are lightweight and do not require data directories. You can run a mongos instance on a system that runs other cluster components, such as on an application server or a server running a mongod process. By default, a mongos instance runs on port 27017. When you start the mongos instance, specify the hostnames of the three config servers, either in the configuration file or as command line parameters.

To avoid downtime, give each config server a logical DNS name (unrelated to the server’s physical or virtual hostname). Without logical DNS names, moving or renaming a config server requires shutting down every mongod and mongos instance in the sharded cluster. To start a mongos instance, issue a command using the following syntax:

mongos --configdb <config server hostnames>

For example, to start a mongos that connects to config server instances running on the following hosts and on the default ports:

cfg0.example.net

cfg1.example.net

cfg2.example.net

You would issue the following command:

mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019

Each mongos in a sharded cluster must use the same configDB string, with identical host names listed in identical order. If you start a mongos instance with a string that does not exactly match the string used by the other mongos instances in the cluster, the mongos returns a Config Database String Error and refuses to start.
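Because the configDB string must match exactly across every mongos, it can help to generate it in one place rather than typing it by hand. The sketch below is plain JavaScript (runnable with Node.js). The function names buildConfigDbString and configDbStringsMatch are hypothetical helpers, not part of MongoDB.

```javascript
// Hypothetical helper: joins host names into the value passed to --configdb.
// The order of the hosts array is preserved, since order matters.
function buildConfigDbString(hosts, port) {
  return hosts.map(function (host) {
    return host + ":" + port;
  }).join(",");
}

// Hypothetical check: true only when every string is character-for-character
// identical, which includes the order of the hosts.
function configDbStringsMatch(strings) {
  return strings.every(function (s) {
    return s === strings[0];
  });
}

const hosts = ["cfg0.example.net", "cfg1.example.net", "cfg2.example.net"];
const configDb = buildConfigDbString(hosts, 27019);
console.log("mongos --configdb " + configDb);

// The same hosts in a different order do not count as a match.
const reordered = buildConfigDbString(hosts.slice().reverse(), 27019);
console.log(configDbStringsMatch([configDb, reordered]));
```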

Add Shards to the Cluster – A shard can be a standalone mongod or a replica set. In a production environment, each shard should be a replica set.

From a mongo shell, connect to the mongos instance using the following syntax:

mongo --host <hostname of machine running mongos> --port <port mongos listens on>

The following are examples of adding a shard with sh.addShard():

sh.addShard( "rs1/mongodb0.example.net:27017" )

sh.addShard( "rs1/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )

sh.addShard( "mongodb0.example.net:27017" )
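The three calls above differ only in the string they pass: a replica set name followed by a slash and one or more members, or a bare host for a standalone mongod. As a rough sketch in plain JavaScript (runnable with Node.js), the hypothetical helper shardAddress below assembles that string. It is an illustration of the format, not a MongoDB function.

```javascript
// Hypothetical helper: builds the string passed to sh.addShard().
// members: array of "host:port" strings.
// replicaSetName: omit it for a standalone mongod.
function shardAddress(members, replicaSetName) {
  const hostList = members.join(",");
  return replicaSetName ? replicaSetName + "/" + hostList : hostList;
}

// Replica set shard, naming a single member.
console.log(shardAddress(["mongodb0.example.net:27017"], "rs1"));

// Standalone shard.
console.log(shardAddress(["mongodb0.example.net:27017"]));
```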

Enable Sharding for a Database – Before you can shard a collection, you must enable sharding for the collection’s database. Enabling sharding for a database does not redistribute data but makes it possible to shard the collections in that database. Once you enable sharding for a database, MongoDB assigns a primary shard for that database, where MongoDB stores all data before sharding begins.

mongo --host <hostname of machine running mongos> --port <port mongos listens on>

sh.enableSharding("<database>")

Optionally, you can enable sharding for a database using the enableSharding command, which uses the following syntax:

db.runCommand( { enableSharding: <database> } )

Enable Sharding for a Collection – You enable sharding on a per-collection basis.

sh.shardCollection("<database>.<collection>", shard-key-pattern)

Replace the <database>.<collection> string with the full namespace of your collection, which consists of the name of your database, a dot (.), and the full name of the collection. The shard-key-pattern represents your shard key, which you specify in the same form as you would an index key pattern. As an example, the following sequence of commands shards four collections:

sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )

sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )

sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )

db.alerts.ensureIndex( { _id : "hashed" } )

sh.shardCollection("events.alerts", { "_id": "hashed" } )

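In order, these operations shard:

The people collection in the records database using the shard key { "zipcode": 1, "name": 1 }.

The addresses collection in the people database using the shard key { "state": 1, "_id": 1 }.

The chairs collection in the assets database using the shard key { "type": 1, "_id": 1 }.

The alerts collection in the events database using the shard key { "_id": "hashed" }. This shard key distributes documents by a hash of the value of the _id field. MongoDB computes the hash of the _id field for the hashed index, which should provide an even distribution of documents across a cluster.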
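The namespace passed to sh.shardCollection() is the database name, a dot, and the collection name. A small sketch in plain JavaScript (runnable with Node.js) shows how such a string breaks apart. The helper name parseNamespace is hypothetical, and splitting at the first dot assumes, as is the case in MongoDB, that database names contain no dots while collection names may.

```javascript
// Hypothetical helper: splits "<database>.<collection>" at the first dot.
// Everything after the first dot is treated as the collection name.
function parseNamespace(namespace) {
  const dot = namespace.indexOf(".");
  if (dot <= 0 || dot === namespace.length - 1) {
    throw new Error("expected <database>.<collection>, got: " + namespace);
  }
  return {
    database: namespace.slice(0, dot),
    collection: namespace.slice(dot + 1)
  };
}

const parsed = parseNamespace("records.people");
console.log(parsed.database + " / " + parsed.collection);
```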

Shard a Collection Using a Hashed Shard Key – Hashed shard keys are new in version 2.4. They use a hashed index of a field as the shard key to partition data across your sharded cluster. If chunk migrations are in progress while you shard a collection with a hashed shard key, the initial chunk distribution may be uneven until the balancer automatically balances the collection.

Shard the Collection – To shard a collection using a hashed shard key, use an operation in the mongo shell that resembles the following:

sh.shardCollection( "records.active", { a: "hashed" } )

This operation shards the active collection in the records database, using a hash of the a field as the shard key.

Specify the Initial Number of Chunks – If you shard an empty collection using a hashed shard key, MongoDB automatically creates and migrates empty chunks so that each shard has two chunks. To control how many chunks MongoDB creates when sharding the collection, use the shardCollection command with the numInitialChunks parameter. MongoDB 2.4 adds support for hashed shard keys. After sharding a collection with a hashed shard key, every mongos and mongod instance in your sharded cluster must run MongoDB 2.4 or higher.
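As an illustration of the numInitialChunks parameter, the sketch below builds the command document for the shardCollection database command in plain JavaScript (runnable with Node.js). The helper name shardCollectionCommand and the chunk count of 8 are assumptions chosen for the example. In a mongo shell connected to a mongos, you would pass the resulting document to db.adminCommand().

```javascript
// Hypothetical helper: builds the document for the shardCollection command.
// numInitialChunks is optional and is left out when not supplied.
function shardCollectionCommand(namespace, key, numInitialChunks) {
  const command = { shardCollection: namespace, key: key };
  if (numInitialChunks !== undefined) {
    command.numInitialChunks = numInitialChunks;
  }
  return command;
}

// Example: request 8 initial chunks (an arbitrary value for illustration).
const hashedCommand = shardCollectionCommand("records.active", { a: "hashed" }, 8);
console.log(JSON.stringify(hashedCommand));

// In the mongo shell, connected to a mongos:
//   db.adminCommand(hashedCommand)
```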

MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, or 2.9. To prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to 64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values larger than 2^53.
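The truncation is easy to see with a few numbers. The following sketch in plain JavaScript (runnable with Node.js) only imitates the truncation step; it does not reproduce MongoDB's hash function. The helper names truncatedForHashing and survivesTruncation are hypothetical, and the check is one reading of "reliably converted" that is only meant for non-negative values.

```javascript
// Illustration only: imitates the truncation step, not the hash itself.
function truncatedForHashing(value) {
  return Math.trunc(value);
}

// Hypothetical check: the value is unchanged by truncation and does not
// exceed 2^53, above which doubles can no longer represent every integer.
function survivesTruncation(value) {
  const limit = Math.pow(2, 53);
  return Math.trunc(value) === value && Math.abs(value) <= limit;
}

// 2.3, 2.2, and 2.9 all truncate to 2, so they would collide.
console.log([2.3, 2.2, 2.9].map(truncatedForHashing));

// Above 2^53, adding 1 can no longer be represented exactly.
console.log(Math.pow(2, 53) + 1 === Math.pow(2, 53));

console.log(survivesTruncation(42));   // a whole number is unaffected
console.log(survivesTruncation(2.9));  // a fractional value is not
```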

Add Shards to a Cluster – You add shards to a sharded cluster after you create the cluster or anytime that you need to add capacity to the cluster. When adding a shard to a cluster, you should always ensure that the cluster has enough capacity to support the migration without affecting legitimate production traffic. In production environments, all shards should be replica sets.

Add a Shard to a Cluster – You interact with a sharded cluster by connecting to a mongos instance.

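From a mongo shell, connect to the mongos instance. For example, to connect to a mongos at mongos0.example.net on port 27017:

mongo --host mongos0.example.net --port 27017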

Add each shard to the cluster with the sh.addShard() method. You can instead use the addShard database command, which lets you specify a name and maximum size for the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. The following are examples of adding a shard with sh.addShard():

sh.addShard( "rs1/mongodb0.example.net:27017" )

For MongoDB versions prior to 2.0.3, you must specify all members of the replica set. For example:

sh.addShard( "rs1/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )

To add a standalone mongod running on port 27017 of mongodb0.example.net as a shard:

sh.addShard( "mongodb0.example.net:27017" )
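The addShard database command mentioned earlier accepts the optional name and maximum size as extra fields. The sketch below builds such a command document in plain JavaScript (runnable with Node.js). The helper name addShardCommand, the shard name "shard0000", and the 1024 value are assumptions for illustration; maxSize is given in megabytes here, which matches the command's documentation for this era of MongoDB but should be checked against the version you run.

```javascript
// Hypothetical helper: builds the document for the addShard command.
// name and maxSizeMB are optional and are left out when not supplied.
function addShardCommand(address, name, maxSizeMB) {
  const command = { addShard: address };
  if (name !== undefined) {
    command.name = name;
  }
  if (maxSizeMB !== undefined) {
    command.maxSize = maxSizeMB;
  }
  return command;
}

// Example values only: the shard name and size limit are made up.
const namedShard = addShardCommand("rs1/mongodb0.example.net:27017", "shard0000", 1024);
console.log(JSON.stringify(namedShard));

// In the mongo shell, connected to a mongos:
//   db.adminCommand(namedShard)
```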

It might take some time for chunks to migrate to the new shard.
