Get the most out of Azure's global DocumentDB

Category: Databases (general) | Posted by: 店小二03 | Date: 2017 | Author: 红领巾

Satya Nadella’s Microsoft is often described as a “new Microsoft,” and if any one part of the company embodies that description, it’s the team that’s building the Microsoft Azure cloud platform. Azure’s collection of services isn’t designed to support Windows developers or Windows applications alone; it’s built on open APIs and standards that open it to anyone wanting to build a cloud-hosted or cloud-serviced application.

One of the more important parts of Azure is perhaps its least known: DocumentDB. Designed to be a global-scale NoSQL database, it’s the back end for many of Microsoft’s own services. There’s a lot to like in DocumentDB, from its MongoDB-compatible APIs to its innovative approach to consistency when working across multiple datacenters. It’s also low-latency, running on Azure’s SSD storage.

DocumentDB: A database for a distributed world

Concurrency and consistency are the classic problems for anyone building distributed systems. How can you guarantee that all your users see the same view of the data, or at least see the data they need to use right now, when data is being written by many thousands of users in regions that can be continents apart?

That’s where the two most commonly used consistency models come into play: strong and eventual.

Strong consistency means that your applications wait until data has been replicated to every instance of your database, ensuring that everyone accessing the data gets the same view, but blocking new writes until that consistent view is achieved. It also means your data needs to stay in the same Azure geographical region. Eventual consistency is a much more relaxed approach, in which users get access to the current state of their local database instance, so there’s no guaranteed level of consistency. It’s fast, but your queries may not give you the latest data.

DocumentDB introduces two new consistency models that have proven to be very popular with developers building apps that use the service.

The first option, bounded staleness, lets you bound how far reads can lag behind writes in one of two ways: by a number of versions, or by a time interval. You can choose, for example, to ensure your data is always consistent after 20 seconds, or that you’re never more than two versions of a document behind the last write. Because you’re setting the bounds on how fast DocumentDB must replicate data between instances, there’s no limit on the number of Azure geographic regions you can use for your data.
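To make the two kinds of bound concrete, here is a minimal sketch in plain Python. It is a toy model, not DocumentDB's replication logic: `StalenessBound` and `within_bound` are hypothetical names, and the numbers are the 20-second and two-version examples from the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StalenessBound:
    """Hypothetical bound: set a version limit, a time limit, or both."""
    max_versions: Optional[int] = None    # how many writes a read may lag behind
    max_seconds: Optional[float] = None   # how long a read may lag behind


def within_bound(bound, latest_version, replica_version, seconds_behind):
    """Return True if a replica is fresh enough to serve a read under `bound`."""
    versions_behind = latest_version - replica_version
    if bound.max_versions is not None and versions_behind > bound.max_versions:
        return False
    if bound.max_seconds is not None and seconds_behind > bound.max_seconds:
        return False
    return True


# The two example limits from the text: 20 seconds, or two versions.
by_time = StalenessBound(max_seconds=20)
by_version = StalenessBound(max_versions=2)

# A replica that is five writes and 12 seconds behind the latest state.
print(within_bound(by_time, 10, 5, 12.0))     # True: 12 s is inside the 20 s window
print(within_bound(by_version, 10, 5, 12.0))  # False: five versions behind exceeds two
```

The same replica passes one bound and fails the other, which is why the choice between a time limit and a version limit depends on how often your data is written.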

The second option is session consistency, where consistency is linked to a client session. That’s a relevant option where you’re employing a cloud database as a back end to handle data, with data replicated across instances and regions while a user is connected to only one instance. The result is a database that responds quickly to both reads and writes. But this approach requires you to think carefully about what data you’re storing in it.
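One way to picture session consistency is as a "read your own writes" guarantee. The sketch below is a simplified simulation under that assumption, not the service's actual protocol: the `Replica` and `Session` classes and the integer session token are all illustrative.

```python
class Replica:
    """A toy replica holding one document and the version it has applied."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        self.version = version
        self.value = value


class Session:
    """Remembers the highest version this client has written (a hypothetical token)."""

    def __init__(self):
        self.token = 0

    def write(self, primary, value):
        primary.apply(primary.version + 1, value)
        self.token = primary.version

    def read(self, replicas):
        """Read from the first replica that has caught up with this session's writes."""
        for replica in replicas:
            if replica.version >= self.token:
                return replica.value
        raise RuntimeError("no replica has caught up with this session yet")


primary, remote = Replica(), Replica()
alice, bob = Session(), Session()

alice.write(primary, "v1")             # not yet replicated to `remote`
print(alice.read([remote, primary]))   # v1: alice skips the stale replica
print(bob.read([remote, primary]))     # None: bob's session accepts the stale replica
```

Alice always sees her own write, while Bob, who has written nothing in his session, can be served quickly from a replica that has not caught up yet.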

You can even mix and match consistency approaches using different DocumentDB instances for different parts of an application. User and session data, where write and read speed are important, can be handled by session consistency, while other aspects of an application where writes are less critical and you’re looking for fast reads can be handled via bounded staleness.

The realities of working with DocumentDB

Microsoft has designed DocumentDB to be elastic, able to scale up and down when adding new databases and collections to an account. Each database you create is made up of collections of JSON-formatted data, as well as a list of users. It’s certainly highly scalable: Microsoft itself has databases that have thousands of collections and contain terabytes of data.

One note: Although Microsoft talks about databases containing users, DocumentDB has a different concept of users from the rest of us. Instead of mapping to an individual account, a DocumentDB user is more abstract, perhaps best thought of as a way of naming access-control policies, so you manage what applications and services have access to what data and how they can use it.
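A rough way to think about this is that a "user" is a named bundle of permissions on resources. The sketch below models that idea in plain Python; the `User` class, the service names, and the resource paths are hypothetical, and the two modes are loosely modeled on read-only versus full access.

```python
class User:
    """A named bundle of permissions, not a person (a toy model)."""

    MODES = ("Read", "All")  # read-only, or full access

    def __init__(self, name):
        self.name = name
        self.permissions = {}  # resource path -> mode

    def grant(self, resource, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.permissions[resource] = mode

    def can_read(self, resource):
        return resource in self.permissions

    def can_write(self, resource):
        return self.permissions.get(resource) == "All"


# Two services share one collection but get different policies.
reporting = User("reporting-service")
reporting.grant("dbs/shop/colls/orders", "Read")

checkout = User("checkout-service")
checkout.grant("dbs/shop/colls/orders", "All")

print(reporting.can_write("dbs/shop/colls/orders"))  # False
print(checkout.can_write("dbs/shop/colls/orders"))   # True
```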

Containers can contain, well, anything. DocumentDB is intended to be schema-free, and content is automatically indexed as you add it to a collection. It’s an approach that makes it harder to query your data because you need to know what you’re looking for. That’s why, in practice, you’re likely to use the common NoSQL key/value pattern for your data, giving you the tools you need to build queries.
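As an illustration of why a predictable key pattern helps, here is a small sketch using plain Python dictionaries in place of a real collection. The field names (`id`, `type`, `customerId`) and the `get_by_id` and `find` helpers are assumptions made for the example; nothing here calls DocumentDB.

```python
# Documents of different shapes can live side by side in one collection.
collection = [
    {"id": "customer:42", "type": "customer", "name": "Ada", "city": "Seattle"},
    {"id": "order:1001", "type": "order", "customerId": "customer:42", "total": 99.5},
    {"id": "order:1002", "type": "order", "customerId": "customer:42", "items": ["book", "pen"]},
]


def get_by_id(docs, doc_id):
    """Key lookup: return the document with this id, or None."""
    for doc in docs:
        if doc["id"] == doc_id:
            return doc
    return None


def find(docs, **criteria):
    """Return every document whose fields match all the given key/value pairs."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


print(get_by_id(collection, "customer:42")["name"])  # Ada
print(len(find(collection, type="order")))           # 2
```

Even though the two orders have different fields, the shared `id` and `type` keys give every query something known to filter on.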

Queries can be made using DocumentDB’s own SQL dialect, as well as by embedding JavaScript functions inside your database. Embedded functions can handle hierarchical queries across all the collections in a database, as well as more complex query types that take advantage of the scale and relatively flat structure of a DocumentDB store. The whole approach remains transactional, and embedded queries will simply abort if there are any exceptions.
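For a feel of what a query looks like, the sketch below builds a parameterized query body: the SQL text plus a list of `@name`/value pairs, which is my understanding of the JSON shape the service accepts for queries. The `build_query` helper and the field names in the SQL are hypothetical, and nothing is sent over the network.

```python
import json


def build_query(sql, **params):
    """Build a parameterized query body from SQL text and named values."""
    return {
        "query": sql,
        "parameters": [
            {"name": "@" + name, "value": value} for name, value in params.items()
        ],
    }


spec = build_query(
    "SELECT c.id, c.total FROM c WHERE c.type = @type AND c.total > @minTotal",
    type="order",
    minTotal=50,
)
print(json.dumps(spec, indent=2))
```

Passing values as parameters rather than splicing them into the SQL string keeps user input out of the query text.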

Like much of Azure, DocumentDB is a pay-as-you-go service, so you’ll need to keep a close watch on what you’re using in terms of storage and bandwidth. (Microsoft offers a mix of pricing options, for scalable databases and for fixed sizes with fixed performance levels.) There’s a cloud sandbox to help you design and build queries, and there’s a local emulator you can use to design your database before deploying anything on Azure. But don’t think about using the emulator in production: It’s not designed to scale and will handle only a few containers.

Applications communicate with the service over a set of REST APIs, which you can call directly. In practice, though, you’re much more likely to use one of Microsoft’s SDKs, which include .NET and Node.js, as well as Java and Python. It’s a lot easier building and delivering JSON documents via an SDK than assembling them by hand and handling asynchronous calls and responses directly. There’s sample code on GitHub to help you get started.
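To show the kind of work an SDK saves you, here is a sketch of signing a single REST request by hand. It follows my understanding of the documented master-key scheme (an HMAC-SHA256 signature over the verb, resource type, resource link, and date), so treat the details as an assumption and check them against the REST reference before relying on them. The key, database, and collection names are made up.

```python
import base64
import hashlib
import hmac
from email.utils import formatdate
from urllib.parse import quote


def auth_token(verb, resource_type, resource_link, date, master_key):
    """Build a URL-encoded authorization token for one request (assumed scheme)."""
    key = base64.b64decode(master_key)
    # String to sign: lowercased verb, type, and date, each followed by a newline.
    payload = f"{verb.lower()}\n{resource_type.lower()}\n{resource_link}\n{date.lower()}\n\n"
    digest = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return quote(f"type=master&ver=1.0&sig={signature}", safe="")


# A made-up key for illustration only; a real one comes from your account.
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
date = formatdate(usegmt=True)  # RFC 1123 date, sent alongside the token

token = auth_token("POST", "docs", "dbs/ToDoList/colls/Items", date, fake_key)
print(token[:34])  # type%3Dmaster%26ver%3D1.0%26sig%3D
```

Every request needs a fresh token tied to its verb, target, and timestamp, which is exactly the sort of bookkeeping the SDKs handle for you.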

A DocumentDB document is really a JSON blob

The biggest problem Microsoft has with DocumentDB is its name. Most of us aren’t familiar with the concept of blobs of JSON-formatted data being called “documents,” when they can contain anything you can wrap in a binary-encoded format, then deliver via REST to an API. Calling it DocumentDB makes it seem that all you can store in it are Office files and raw text files, when DocumentDB is really the flexible back end you need to build a modern born-in-the-cloud application.

DocumentDB has often been described within Microsoft as a “planetary-scale database.” As we build more and more software in the cloud, using more and more distributed application concepts, a planet-scale database makes a lot of sense. All we need to do is design our applications to take advantage of it, thinking about how we prioritize both reads and writes across our distributed code.

Site link: http://www.codesec.net/view/520262.html