Sunday, September 28, 2014

Getting started with MongoDb

Introduction


In this module, we will introduce you to mongodb, its features, advantages and why do we need it.

Overview
MongoDB is classified as a NoSQL database written in c++ and it is a cross-platform, document oriented database that provides, high performance, high availability and easy scalability. MongoDB works on concept of collection and document.

Database
Database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.

Collection
Collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema. Documents within a collection can have different fields. Typically, all documents in a collection are of similar or related purpose.



Document
A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema means that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.



RDBMS terminology with MongoDB


Licensing and support

MongoDB is available for free under the GNU Affero General Public License. The language drivers are available under an Apache License. In addition, MongoDB Inc. offers commercial licenses for MongoDB.

Downloads


To install the MongoDB on your OS, first download the latest release of MongoDB from http://www.mongodb.org/downloads Make sure you get correct version of MongoDB depending upon your OS version. Please follow my installation module to get installation process.

Advantages

Any relational database has a typical schema design that shows number of tables and the relationship between these tables. While in MongoDB there is no concept of relationship

Advantages of MongoDB over RDBMS
  • Schema less : MongoDB is document database in which one collection holds different different documents. 
    Number of fields, content and size of the document can be differ from one document to another.
  • Structure of a single object is clear
  • No complex joins
  • Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language 
    that's nearly as powerful as SQL
  • Tuning
  • Ease of scale-out: MongoDB is easy to scale
  • Conversion / mapping of application objects to database objects not needed
  • Uses internal memory for storing the (windowed) working set, enabling faster access of data

Features

Some of the main features include:
Document-oriented
Instead of taking a business subject and breaking it up into multiple relational structures, MongoDB can store the business subject in the minimal number of documents. For example, instead of storing title and author information in two distinct relational structures, title, author, and other title-related information can all be stored in a single document called Book, which is much more intuitive and usually easier to work with.
Ad hoc queries
MongoDB supports search by field, range queries, regular expression searches. Queries can return specific fields of documents and also include user-defined JavaScript functions.
Indexing
Any field in a MongoDB document can be indexed (indices in MongoDB are conceptually similar to those in RDBMSes). Secondary indices are also available.
Replication
MongoDB provides high availability with replica sets. A replica set consists of two or more copies of the data. Each replica set member may act in the role of primary or secondary replica at any time. The primary replica performs all writes and reads by default. Secondary replicas maintain a copy of the data on the primary using built-in replication. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary. Secondaries can also perform read operations, but the data is eventually consistent by default.
Load balancing
MongoDB scales horizontally using sharding. The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more slaves.)
MongoDB can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure. Automatic configuration is easy to deploy, and new machines can be added to a running database.
File storage
MongoDB can be used as a file system, taking advantage of load balancing and data replication features over multiple machines for storing files.
This function, called GridFS, is included with MongoDB drivers and available with no difficulty for development languages . MongoDB exposes functions for file manipulation and content to developers. GridFS is used, for example, in plugins for NGINX and lighttpd. Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.
In a multi-machine MongoDB system, files can be distributed and copied multiple times between machines transparently, thus effectively creating a load-balanced and fault-tolerant system.
Aggregation
MapReduce can be used for batch processing of data and aggregation operations. The aggregation framework enables users to obtain the kind of results for which the SQLGROUP BY clause is used.
Server-side JavaScript execution
JavaScript can be used in queries, aggregation functions (such as MapReduce), and sent directly to the database to be executed.
Capped collections
MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like acircular queue.

Where should we use MongoDB?

Mongodb features:
  • Big Data
  • Content Management and Delivery
  • Mobile and Social Infrastructure
  • User Data Management
  • Data Hub

In next modules, we will be having a look for how to use it in our applications for CRUD operations.


No comments:

Post a Comment