Learn How to Store & Retrieve Large Files in MongoDB With GridFS

In this chapter, we are going to learn about the GridFS specification. GridFS is a way to store and retrieve large files such as audio files, video files and images in MongoDB. As the name suggests, GridFS works like a file system, except that the file data is stored inside MongoDB collections. GridFS can store files that are far larger than the 16 MB limit that MongoDB places on a single document. It does this by splitting a large file into chunks and storing each chunk as a separate document. By default, each chunk holds up to 255 KB of data, and the last chunk may be smaller.
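The splitting step described above can be sketched in a few lines of Python. This is only an illustration, assuming the 255 KB chunk size mentioned above (255 × 1024 = 261120 bytes); it does not talk to MongoDB, and the names split_into_chunks, DEFAULT_CHUNK_SIZE and payload are made up for this example.

```python
from typing import Iterator

# Assumption: 255 KB expressed in bytes (255 * 1024 = 261120).
DEFAULT_CHUNK_SIZE = 255 * 1024


def split_into_chunks(data: bytes, chunk_size: int = DEFAULT_CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


# A hypothetical 600 KB payload of zero bytes standing in for a real file.
payload = bytes(600 * 1024)
chunks = list(split_into_chunks(payload))
chunk_lengths = [len(chunk) for chunk in chunks]
print(chunk_lengths)  # two full chunks followed by a smaller final chunk
```

Only the final chunk is shorter than the chunk size, and joining the chunks in order gives back the original bytes.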

GridFS Storage Fashion
Internally, GridFS uses two collections, fs.files and fs.chunks. They store the file's metadata and the file's chunks respectively. In fs.chunks, each chunk document has its own _id field of type ObjectId, which identifies the chunk uniquely. Each document in fs.files acts as the parent of its chunks: every chunk document in fs.chunks points back to its parent through the files_id field.

Example:
The following is a sample document from the fs.files collection:

{
  "filename" : "eduonix.txt",
  "chunkSize" : NumberInt(981520),
  "uploadDate" : ISODate("2017-10-09T10:37:23.367Z"),
  "md5" : "9b765439321e147545c07f72d54cca4e",
  "length" : NumberInt(786)
}

The above document specifies the file name, chunk size, upload date, MD5 checksum and length of the file. The following is a sample document from the fs.chunks collection:

{
  "files_id" : ObjectId("856a75e19f54cfee9a2fe78c"),
  "n" : NumberInt(1),
  "data" : "Binary Data for MongoDB"
}
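To make the relationship between the two collections concrete, here is a minimal in-memory sketch in Python. It builds plain dictionaries shaped like the sample documents above and then rebuilds the file from its chunks. It is a simulation only, not the MongoDB driver: the helper names to_gridfs_like_docs and reassemble are hypothetical, a random hex string stands in for the ObjectId that MongoDB would assign, and the tiny chunk size of 64 bytes is chosen just to keep the example small.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def to_gridfs_like_docs(filename, data, chunk_size):
    """Return (files_doc, chunk_docs) shaped like fs.files / fs.chunks entries."""
    file_id = uuid.uuid4().hex  # stand-in for a real ObjectId (assumption)
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "md5": hashlib.md5(data).hexdigest(),
        "length": len(data),
    }
    # One chunk document per slice; n is the chunk's position, starting at 0.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(files_doc, chunk_docs):
    """Rebuild the file by joining its own chunks in ascending order of n."""
    own = [c for c in chunk_docs if c["files_id"] == files_doc["_id"]]
    return b"".join(c["data"] for c in sorted(own, key=lambda c: c["n"]))


content = b"Binary Data for MongoDB" * 10  # 230 bytes of sample data
files_doc, chunk_docs = to_gridfs_like_docs("eduonix.txt", content, chunk_size=64)

# Chunks may come back in any order; files_id and n are enough to restore the file.
restored = reassemble(files_doc, list(reversed(chunk_docs)))
print(len(chunk_docs), restored == content)
```

The files_id field tells us which file a chunk belongs to, and n tells us where the chunk sits within that file.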

How to add files to GridFS?
Let us store an mp3 file in GridFS by using the put command. For this we use the mongofiles.exe utility, which should be available in the bin folder of your MongoDB installation (check the MongoDB installation path on your machine). Open the Windows command prompt, navigate to the bin folder where mongofiles.exe is present, and enter the following command.

Command

  • mongofiles.exe -d gridfs_demo put myfavouritesong.mp3

Here, gridfs_demo is the name of the database in which the mp3 file will be stored. If the specified database does not exist, MongoDB creates it on the fly. Once the file is stored, we can retrieve the details of the file's document with the following query, run from the mongo shell against the gridfs_demo database.

Command

  • db.fs.files.find()

Executing the above command returns the following details for the file's document.

Output

{
  _id: ObjectId('534a811bf8b4aa4d33fdf94d'),
  filename: "myfavouritesong.mp3",
  chunkSize: 3456785,
  uploadDate: new Date(1597394567874),
  md5: "d5c12345c909g7bed2d9c435e34d1d43",
  length: 30402848
}

Next, we can check whether the chunks related to the stored file are present in the fs.chunks collection with the help of the following command. Here, we use the _id that was returned by the previous query.

Command

  • db.fs.chunks.find({ files_id: ObjectId('534a811bf8b4aa4d33fdf94d') })

After successful execution of the above command, you can observe that the large mp3 file has been broken down into numerous chunks, each of which is stored as a separate document in the fs.chunks collection.
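The length and chunkSize values shown in the fs.files output above determine how many chunk documents we should expect to see. The following Python sketch works this out for the sample values; the function name expected_chunks is hypothetical, and the calculation simply assumes that every chunk except possibly the last one is exactly chunkSize bytes long.

```python
def expected_chunks(length, chunk_size):
    """Return (number of chunks, size of the last chunk) for a stored file."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    full_chunks, remainder = divmod(length, chunk_size)
    if remainder:
        # A partial chunk holds the leftover bytes at the end of the file.
        return full_chunks + 1, remainder
    # The file divides evenly (or is empty), so there is no partial chunk.
    return full_chunks, chunk_size if length else 0


# Values taken from the sample fs.files output: length and chunkSize.
count, last_size = expected_chunks(30402848, 3456785)
print(count, last_size)
```

For the sample values this gives 9 chunks, with n running from 0 to 8, and the final chunk holding the remaining 2748568 bytes.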

Conclusion:
In this chapter, we have discussed the GridFS specification of MongoDB, which splits large files such as audio files, video files and images into small chunks and stores them as documents in MongoDB collections. We have also covered the behaviour of GridFS in MongoDB with the help of suitable examples.
