Hardware Provisioning for MongoDB
Some of the most common questions we hear from users relate to capacity planning and hardware choices. How many replicas do I need? Should I consider sharding right away? How much RAM will I need for my working set? SSD or HDD? No one likes spending a lot of cash on hardware and cloud bills can just be as painful. MongoDB is different from traditional RDBMSs in its resource management, so you need to be mindful when deciding on the cluster layout and hardware. In this talk we will review the factors that drive the capacity requirements: volume of queries, access patterns, indexing, working set size, among others. Attendees will gain additional insight as we go through a few real-world scenarios, as experienced with MongoDB Inc customers, and come up with their ideal cluster layout and hardware.
Senior Solution Architect at MongoDB
Chad Tindel is a Senior Solution Architect at MongoDB where he specializes in helping customers understand and use the nosql product to solve complex business problems. Previously, Chad was a Solution Architect at Cloudera focusing on the Hadoop space and was also a Solution Architect at Red Hat, helping customers build out their enterprise Linux infrastructures. He holds a BS in Computer Science from California Polytechnic in San Luis Obispo as well as an MS in Finance from the University of Denver.
There is not a lot of information out there on sizing MongoDB. This session, even though the last session was well attended.
How do you size?
Often customers over or under engineer.
Think the scenario where your app gets listed and suddenly lots of sign ups. The server gets beaten and needs to be re-sized.
Requirements – Step 1
What are the business requirements
- Uptime (do you need more than one Data Center
- Acceptable latency – especially during peak times.
- Requirements can change over time
- More users, more data, new indexes
- More writes
- Collect metrics!
- Adjust configuration incrementally
- Plan ahead
Try to avoid a crisis.
Do a Proof of Concept
- Start small on a single node
- Design your schema (Read and write applications are different)
- Understand query patterns
- Get a handle on working set (the active data)
Then add replication to see impact
Review Requirements as result of POC
- Data sizes (Number of documents, Average document size, size of data on disk, size of indexes, expected growth, document model)
- Ingestion – Throughput / Updates / Deletes per second peak and average
- Bulk inserts? How large and How often?
Do you have SLAs on this performance?
- Performance expectations
- Life of data
- Security requirements (SSL, Encryption at rest)
- Number of data centers in use (Active/Active , Active/Passive Cross Data Center latency)
IOPS (4K in size)
Data and loading patterns.
CPU tends to be less important
Fast storage and as much RAM as you can.
Network latency affects replication lag
7200 RPM SATA = 75-100 IOPS
15000 SAS = 175-210 IOPS
Amazon SSD EBS = 4,000 PIOPS / Volume
48,000 PIOPS / Instance
Intel X-25-E SLC = 5,000 IOPS
Use IOSTAT to monitor disk performance (or MongoPerf).
Release 2.4 added a feature to estimate the size of a working set.
Latency impacts WriteConcern time and ReadPreference
Throughput impacts Update and Write Patterns and Read/Queries
Use Netperf to measure network performance.
Only really comes in to play when using queries without indexes which mean performing a table scan.
or for Sorting within a Shard and MergeSorted when aggregated.
Aggregation Framework or MapReduce require CPU Performance.
Case Study – Spanish Bank:
- 6 months of logs held for 6 months
- 18TB at 3TB/Month
3 Nodes / shard * 36 Shards = 108 Physical Machines
128GB/RAM * 36 = 4.6TB RAM
3 Config servers (virtual machines)
- moving Product catalog from SQL Server to MongoDB as an overhaul to Open Source
- 2 Main Data Centers active/active
- Cyber Monday peaks at 214 Requests/Sec. Budget for 400 Requests/Sec for headroom.
- Heavy Read process orientation.
- 4 M product SKU’s with JSON document size of 30KB
- Requests for specific product (by _id)
- Products by Category (Return 72 documents – or 200 if a google bot)
- Partition (Shard) by Category.
- Products in multiple categories are duplicated means on average doc is in 2 categories so store 4M SKUs x 2 = 8M
8M docs * 30K want everything in memory. 384GB RAM/Server
Sharding adds a layer of complexity (eg. Add config server) so don’t shard unless you need to.
Determined a 4 Node Replica set 2 in each Data Center. Plus an Arbiter.
Recommended a Single Replica Set
– 4 Node Replica
But customer found they could only deploy on 64G RAM. So they deployed 3 shards 4 nodes each + Arbiter.
Arbiters are small. They just exist for voting. Can be a small 1VCPU with 4GB RAM.
[tag health cloud BigData MongoDB MongoDBWorld NoSQL]
Health & Cloud Technology Consultant
Mark is available for challenging assignments at the intersection of Health and Technology using Big Data, Mobile and Cloud Technologies. If you need help to move, or create, your health applications in the cloud let’s talk.
Stay up-to-date: Twitter @ekivemark
Disclosure: I began as a Patient Engagement Advisor and am now CTO to Personiform, Inc. and their Medyear.com platform. Medyear is a powerful free tool that helps you collect, organize and securely share health information, however you want. Manage your own health records today. Medyear: The Power Grid for your Health.