Tag Archives: MongoDB

#MongoDBWorld Innovation Awards

Analytics: eBay, Genentech

Big Apple for New York Companies: SumAll – aggregating data at scale on MongoDB on ObjectRocket at Rackspace

Cool Data: UK Met Office Space Weather Project

Data Science: eHarmony generating more than 3B potential matches per day. MongoDB helped reduce time to match to minutes.

Gaming: EA runs FIFA on MongoDB

Education: LinkedIn and their internal LearnIn platform (see my earlier post)

Internet of Things: Bosch

MongoDB + Hadoop: United Health Group – Optum Insight (see my earlier post)

Open Source: 3D Repo

Scale: Adobe, Lockheed Martin

Startup: Twine (Health)

Tools: JSON Studio, Meteor


Mark Scrimshire
Health & Cloud Technology Consultant

Mark is available for challenging assignments at the intersection of Health and Technology using Big Data, Mobile and Cloud Technologies. If you need help to move, or create, your health applications in the cloud let’s talk.
Blog: http://blog.ekivemark.com
email: mark@ekivemark.com
Stay up-to-date: Twitter @ekivemark
Disclosure: I began as a Patient Engagement Advisor and am now CTO to Personiform, Inc. and their Medyear.com platform. Medyear is a powerful free tool that helps you collect, organize and securely share health information, however you want. Manage your own health records today. Medyear: The Power Grid for your Health.

#MongoDBWorld IBM / Cloudant – Adam K

http://world.mongodb.com/content/keynote-dr-angel-luis-diaz-ibm

Remarks from Dr. Angel Luis Diaz, VP Open Technology and Cloud Performance Solutions, IBM.

What is IBM doing to push innovation?

Cloudant has been committing to CouchDB and MongoDB.

IBM BlueMix – Platform as a Service built on Cloud Foundry.

It is fascinating to see the evolution of IBM. No longer hardware. Services are the future.

Cloudant JSON doc store delivered as DB as a Service (DBaaS).
Cloudant Query – Implements MongoDB style DB find.

Cloudant Query is taking MongoDB style syntax and delivering via http.
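
To make "MongoDB style syntax delivered via http" concrete, here is a minimal sketch of what such a request might look like. It assumes Cloudant Query's `_find` endpoint and a JSON body with a `selector` field; the account URL, database name and field names are made up for illustration, and the request is only built, never sent.

```python
import json
import urllib.request


def build_find_request(base_url, database, selector, limit=None):
    """Build (but do not send) a Cloudant Query style HTTP request."""
    body = {"selector": selector}  # MongoDB-style query document
    if limit is not None:
        body["limit"] = limit
    return urllib.request.Request(
        url=f"{base_url}/{database}/_find",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical account URL and database name, for illustration only.
request = build_find_request(
    "https://example.cloudant.com",
    "patients",
    {"age": {"$gt": 40}},
    limit=10,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The selector is the same shape of document you would hand to a MongoDB find, which is the point being made in the keynote.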

MongoDB style querying is becoming a de facto standard.

More common interfaces can help NoSQL, just as SQL did when relational databases were introduced.

#MongoDBWorld Charity Majors (@mipsytipsy) @Parse closing keynote

Charity Majors @Parse / Facebook

http://world.mongodb.com/content/keynote-charity-majors-parsefacebook

Parse handles the backend – push notifications, analytics and a ton of other server-side services – and delivers at scale.

Parse has 270,000 mobile apps running on Parse – all hosted on MongoDB.

All Software is a pain!

MongoDB + Ops

Reliability

Reliability – MongoDB is not immune to crashes. The key to resiliency is the Replica Set. You only have to be concerned about the service.

Horizontally scalable services mean no pets. You are dealing with a herd. Cattle, not pets. No hand-crafted server pets, because pets will always die.

Charity – A Battlestar Geek – Your life is without meaning without BG!

Design for High Availability from the outset.

You can’t design in high availability AFTER the fact.

Ops people hate software because they have to plan for failure. It is going to happen.

Flexibility

When you change a schema EVERYTHING breaks! Why do you want a schema????

Data model Flexibility is critical

Workload flexibility is also critical.

When you have hundreds of thousands of apps you have everything. You can’t optimize for a specific load.

Every App must be performant AND must be able to scale.

ONE re-usable solution is better than multiple platforms optimized for specific systems. Engineering workload is a limiting factor.

Choose ONE SINGLE Reusable solution.

Automation

Make repetitive, annoying tasks easy.

Scalability is about more than handling tasks really quickly.

The replica set allows you to take out nodes and work on them.

Parse is dealing with hundreds of terabytes every month.

MongoDB works for Parse:
– Flexible
– Resilient
– Automation friendly

Automation needs operations best practices to be shared. Operations is still young.
Parse has published open source tools

Parse launched these tools today:
– Mongo Proxy github.com/facebookgo/dvara

Allows the replay of workload profiles. Replay in line with original snapshots, or as fast as possible.

Both tools are written in Go.
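
The two replay modes mentioned above (in line with the original timing, or as fast as possible) can be sketched in a few lines. This is only an illustration of the idea, not how the Parse tools work; `replay_delays` and the timestamps are hypothetical.

```python
def replay_delays(timestamps, as_fast_as_possible=False):
    """Return how long to wait (in seconds) before issuing each recorded operation."""
    if as_fast_as_possible or not timestamps:
        # No pacing at all: fire every operation back to back.
        return [0.0] * len(timestamps)
    delays = [0.0]  # the first operation starts immediately
    for previous, current in zip(timestamps, timestamps[1:]):
        # Preserve the gap that was observed in the original workload.
        delays.append(max(0.0, current - previous))
    return delays


# Hypothetical recorded operation times, in seconds.
recorded = [10.0, 10.5, 12.0]
print(replay_delays(recorded))
print(replay_delays(recorded, as_fast_as_possible=True))
```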

#MongoDBWorld Hidden gems in the new 2.6 version of @mongoDB

More from #MongoDBWorld.

Hidden Gems in the 2.6 Release

Everyone using MongoDB is familiar with the big features of the 2.6 release (and if you’re not, here’s a link) — text search, $out, user-defined roles, X509 authentication, etc. But what about the little guys? Our VP of Engineering, Daniel Pasette, will take you on a tour of five small but mighty features from the 2.6 release that make your MongoDB experience more productive.

Dan Pasette

VP of Core Engineering at MongoDB

Dan is the VP of Core Engineering at MongoDB. Prior to joining MongoDB, Dan was a Development Manager at LimeWire where he led a team working on content ingestion for an (unreleased) digital music service called Grapevine. Past employment includes MTV Networks, Sonicnet, iXL, and Electronic Book Technologies. Dan holds a degree in Computer Science from Brown University.

http://world.mongodb.com/mongodb-world/session/hidden-gems-26-release

The Technical sessions are packed. I was hoping to look at Memory Management but the room was full to overflowing. So I dropped in to the session on the latest release of MongoDB – Version 2.6.

Power of 2 – Now default allocation Strategy

The Power of 2 feature allocates extra space when saving records. It is on by default in the latest release. It is best suited to workloads that rewrite records in the database. What typically happens is that a rewrite grows the record so that it no longer fits in its existing space. The extra space enabled by Power of 2 makes it more likely that records can be written back to the space they came from.

Adding space to records reduces the amount of data movement, because as data grows inside records the records still fit.
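
A rough sketch of the idea, assuming the allocation is simply the record size rounded up to the next power of two. It ignores any minimum allocation size or special handling for very large records that the server may apply, and the function names are my own.

```python
def power_of_2_allocation(record_size):
    """Round a record size in bytes up to the next power of two."""
    if record_size < 1:
        raise ValueError("record_size must be positive")
    return 1 << (record_size - 1).bit_length()


def fits_in_place(original_size, updated_size):
    """True if the grown record still fits in its original allocation."""
    return updated_size <= power_of_2_allocation(original_size)


# A 600 byte record gets a 1024 byte slot, so it can grow to 900 bytes
# without moving, but not to 1100 bytes.
print(power_of_2_allocation(600))
print(fits_in_place(600, 900))
print(fits_in_place(600, 1100))
```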

Server Side Timeouts

An example: a collection was indexed in staging, but the index was forgotten in production. This can cause table scans that lead users to retry or re-scan, which creates socket timeouts and can impact other users on the system. The new feature is maxTimeMS. It lets you set a maximum time for how long an operation can run in the database. Set it anywhere from milliseconds to minutes, depending on the operation.
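
To illustrate what a server-side time limit buys you, here is a small simulation: a scan that gives up once its time budget is spent instead of running on after the client has gone away. This is not MongoDB's implementation; `scan_with_time_limit` and `OperationTimedOut` are hypothetical names, and the clock is injectable only so the behaviour can be checked deterministically.

```python
import time


class OperationTimedOut(Exception):
    """Raised when an operation exceeds its time budget."""


def scan_with_time_limit(documents, predicate, max_time_ms, clock=time.monotonic):
    """Scan documents, giving up once max_time_ms has elapsed."""
    deadline = clock() + max_time_ms / 1000.0
    matches = []
    for doc in documents:
        # Check the budget before doing more work on behalf of this query.
        if clock() > deadline:
            raise OperationTimedOut(f"exceeded {max_time_ms} ms")
        if predicate(doc):
            matches.append(doc)
    return matches


docs = [{"n": i} for i in range(5)]
print(scan_with_time_limit(docs, lambda d: d["n"] % 2 == 0, max_time_ms=1000))
```

The point is that the limit is enforced where the work happens, so a runaway scan stops consuming resources even if nobody is waiting for the answer any more.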

Query Engine Introspection

This works in conjunction with maxTimeMS. It allows you to delve into queries to resolve problems. The query execution framework was completely rewritten in 2.6. Prior to 2.6 the query path was opaque to users. This changed in 2.6.

The Query Planner chooses the best index for a given query.

The Query Parser sends the query to the Query Planner. The result is passed to the Plan Cache, which passes it to the Plan Runner.

The Plan Enumerator passes all the candidate plans to the MultiPlan Runner, which runs them for a limited amount of time and then chooses the most efficient.

On subsequent execution of the same query the query goes straight to the Plan Cache.

The Plan Cache can end up holding a sub-optimal plan.
Plans are dropped after indexing and other major changes.
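
The flow described above (trial the candidate plans, keep the winner, reuse it for the same query, drop it when indexes change) can be sketched as follows. This is a toy model under my own assumptions, not the server's code: the class, the "query shape" key and the cost numbers are all hypothetical.

```python
class PlanCache:
    """Toy plan cache: remembers the winning plan for each query shape."""

    def __init__(self):
        self._winners = {}

    def clear(self):
        """Drop cached plans, e.g. after an index is added or removed."""
        self._winners.clear()

    def choose_plan(self, query_shape, candidate_plans, trial_cost):
        """Return the cached winner, or trial every candidate and cache the cheapest."""
        if query_shape not in self._winners:
            self._winners[query_shape] = min(candidate_plans, key=trial_cost)
        return self._winners[query_shape]


# Hypothetical plans and the work each needed during a short trial run.
costs = {"index_on_age": 12, "index_on_name": 40, "collection_scan": 900}
trial_calls = []


def trial_cost(plan):
    trial_calls.append(plan)  # record that this plan was actually trialled
    return costs[plan]


cache = PlanCache()
first = cache.choose_plan("find by age", list(costs), trial_cost)
second = cache.choose_plan("find by age", list(costs), trial_cost)
print(first, second, len(trial_calls))
```

The second call never touches the candidates, which is the benefit of the cache and also the risk: if the cached winner is a poor one, it stays until the cache is cleared.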

getPlanCache

A set of Plan Cache tools to view and manipulate the cache.

Background indexing on Secondaries

This has existed but the feature has been rounded out.

Pre-2.6 background index builds became foreground index builds when replicated to secondaries.

2.6 keeps background index builds in the background on secondaries.
Note: A background index build isn’t as fast as a foreground build and the resulting index is less tightly packed.

User Driven Enhancements

All of these features came about as a result of user feedback that comes through jira.mongodb.com.

Limits on Replica sets

Limit of 12 nodes in a replica set with 7 voting members
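
A quick sanity check for those limits might look like this. It assumes a replica set config document with a `members` list where each member's `votes` defaults to 1; the helper name and host names are hypothetical.

```python
MAX_MEMBERS = 12         # limit quoted in the session
MAX_VOTING_MEMBERS = 7   # limit quoted in the session


def check_replica_set_limits(config):
    """Return a list of problems found in a replica set config document."""
    members = config.get("members", [])
    # A member votes unless its config explicitly sets votes to 0.
    voting = [m for m in members if m.get("votes", 1) > 0]
    problems = []
    if len(members) > MAX_MEMBERS:
        problems.append(f"{len(members)} members exceeds the limit of {MAX_MEMBERS}")
    if len(voting) > MAX_VOTING_MEMBERS:
        problems.append(
            f"{len(voting)} voting members exceeds the limit of {MAX_VOTING_MEMBERS}"
        )
    return problems


# Hypothetical 9-node set where every member votes.
config = {
    "_id": "rs0",
    "members": [{"_id": i, "host": f"db{i}.example.com:27017"} for i in range(9)],
}
print(check_replica_set_limits(config))
```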

