
MongoDB Aggregation Pipeline Patterns Part 1


Today we are going to explore different ways of transforming raw data efficiently. I'll be going over a couple of patterns that will make it easier for you to calculate everything you need in a single MongoDB aggregation pipeline query.

Consider the following data set for these examples:

// Sales Collection
[
  {sales: 10, week: 1, year: 2016, source: 'retail'},
  {sales: 12, week: 2, year: 2016, source: 'retail'},
  {sales: 14, week: 3, year: 2016, source: 'retail'},
  {sales: 16, week: 4, year: 2016, source: 'retail'},
  {sales: 1, week: 1, year: 2016, source: 'online'},
  {sales: 2, week: 2, year: 2016, source: 'online'},
  {sales: 4, week: 3, year: 2016, source: 'online'},
  {sales: 6, week: 4, year: 2016, source: 'online'},
  {sales: 9, week: 1, year: 2015, source: 'retail'},
  {sales: 11, week: 2, year: 2015, source: 'retail'},
  {sales: 13, week: 3, year: 2015, source: 'retail'},
  {sales: 15, week: 4, year: 2015, source: 'retail'},
  {sales: 0, week: 1, year: 2015, source: 'online'},
  {sales: 1, week: 2, year: 2015, source: 'online'},
  {sales: 2, week: 3, year: 2015, source: 'online'},
  {sales: 3, week: 4, year: 2015, source: 'online'},
]

Split Concerns

This method is great for setting up data before doing larger calculations.

// Pipeline instructions
const pipeline = [
  {$group: {
    _id: {week: '$week', source: '$source'},
    sales: {$sum: {$cond: [{$eq: ['$year', 2016]}, '$sales', 0]}},
    salesLastYear: {$sum: {$cond: [{$eq: ['$year', 2015]}, '$sales', 0]}},
  }},
]

In this example we are grouping by week and source and doing a $sum of sales. The $cond expression routes each document into the right accumulator by year, so this year's total (sales) and last year's (salesLastYear) are computed side by side in a single pass.

This will make your pipeline easier to understand. You should always wait until the last step to actually calculate the results you want to see. Everything you are doing before that is to set up the variables you need to make those calculations.
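The $group stage above can be sketched in plain JavaScript to show what the accumulators compute (a conceptual analogue using a small hypothetical slice of the sample data, not run against a real MongoDB server):

```javascript
// A few documents from the sample collection (week 1 only, for brevity).
const docs = [
  {sales: 10, week: 1, year: 2016, source: 'retail'},
  {sales: 9,  week: 1, year: 2015, source: 'retail'},
  {sales: 1,  week: 1, year: 2016, source: 'online'},
  {sales: 0,  week: 1, year: 2015, source: 'online'},
];

// Group by (week, source); the $cond picks which accumulator each doc feeds.
const groups = {};
for (const doc of docs) {
  const key = `${doc.week}|${doc.source}`;
  groups[key] = groups[key] || {
    _id: {week: doc.week, source: doc.source},
    sales: 0,
    salesLastYear: 0,
  };
  // {$sum: {$cond: [{$eq: ['$year', 2016]}, '$sales', 0]}}
  groups[key].sales += doc.year === 2016 ? doc.sales : 0;
  // {$sum: {$cond: [{$eq: ['$year', 2015]}, '$sales', 0]}}
  groups[key].salesLastYear += doc.year === 2015 ? doc.sales : 0;
}

console.log(Object.values(groups));
```

Each (week, source) pair ends up as one row with both years' totals already side by side, which is exactly what the later stages rely on.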

Unwind to Index

This method is great for indexing data and can be useful when order matters.

// Pipeline instructions
const pipeline = [
  {$group: {
    _id: {week: '$week', source: '$source'},
    sales: {$sum: {$cond: [{$eq: ['$year', 2016]}, '$sales', 0]}},
    salesLastYear: {$sum: {$cond: [{$eq: ['$year', 2015]}, '$sales', 0]}},
  }},
  {$sort: {'_id.week': 1}},
  {$group: {
    _id: '$_id.source',
    d: {$push: '$$ROOT'},
  }},
  {$unwind: {
    path: '$d',
    includeArrayIndex: 'idx',
  }},
]

Notice how we sort the results by week number. Then we group by source to push all of the weeks into an ordered array. Now that each source's data is in week order, we $unwind but keep an index by using includeArrayIndex. With this we know the exact position of each week inside the array, while still working with a flat MongoDB document.
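The sort → group → unwind-with-index sequence can be mirrored in plain JavaScript (a conceptual sketch starting from a hypothetical subset of the grouped rows, not run against MongoDB):

```javascript
// Grouped rows from the previous stage (two weeks per source, for brevity).
const rows = [
  {_id: {week: 2, source: 'retail'}, sales: 12, salesLastYear: 11},
  {_id: {week: 1, source: 'retail'}, sales: 10, salesLastYear: 9},
  {_id: {week: 1, source: 'online'}, sales: 1,  salesLastYear: 0},
  {_id: {week: 2, source: 'online'}, sales: 2,  salesLastYear: 1},
];

// {$sort: {'_id.week': 1}}
rows.sort((a, b) => a._id.week - b._id.week);

// {$group: {_id: '$_id.source', d: {$push: '$$ROOT'}}}
const bySource = {};
for (const row of rows) {
  (bySource[row._id.source] = bySource[row._id.source] || []).push(row);
}

// {$unwind: {path: '$d', includeArrayIndex: 'idx'}}
const unwound = [];
for (const [source, d] of Object.entries(bySource)) {
  d.forEach((item, idx) => unwound.push({_id: source, d: item, idx}));
}
```

Because the push happened on week-sorted input, idx 0 is always a source's earliest week, idx 1 the next, and so on.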

The result is one document per source and week, each carrying its position within that source's week-ordered array as idx.
Group to Map and Filter

This method is great for remapping data.

Suppose we need the historical data of every preceding week attached to each week for each source. We would need to do this:

// Pipeline instructions
const pipeline = [
  {$group: {
    _id: {week: '$week', source: '$source'},
    sales: {$sum: {$cond: [{$eq: ['$year', 2016]}, '$sales', 0]}},
    salesLastYear: {$sum: {$cond: [{$eq: ['$year', 2015]}, '$sales', 0]}},
  }},
  {$sort: {'_id.week': 1}},
  {$group: {
    _id: '$_id.source',
    d: {$push: '$$ROOT'},
  }},
  {$unwind: {
    path: '$d',
    includeArrayIndex: 'idx',
  }},
  // New Part
  {$project: {
    _id: 1,
    idx: 1,
    week: '$d._id.week',
    sales: '$d.sales',
    salesLastYear: '$d.salesLastYear',
  }},
  {$group: {
    _id: '$_id',
    data: {$push: '$$ROOT'},
  }},
  {$project: {
    _id: 1,
    data: {
      $map: {
        input: '$data',
        as: 'datum',
        in: {
          week: '$$datum.week',
          sales: '$$datum.sales',
          salesLastYear: '$$datum.salesLastYear',
          previousWeeks: {$filter: {
            input: '$data',
            as: 'filterDatum',
            cond: {$lt: ['$$filterDatum.idx', '$$datum.idx']},
          }},
        },
      },
    },
  }},
  {$unwind: '$data'},
]

First we $project to create an object that's easier to work with. Next we $group all of that data by source so each source's rows sit together in one array. Now we can use $map and $filter to walk through each row of data and attach all of the previous weeks, using the index we calculated with the previous technique.

The $map operator takes three parameters: input, as, and in. input selects the array in the current document you want to map over, as names the variable bound to each element, and in is the expression evaluated for each element. $map returns a new array of the same length as the input.
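The same three-part shape maps directly onto Array.prototype.map in plain JavaScript (a conceptual analogue with hypothetical data, not MongoDB itself): input is the array, as is the callback parameter name, and in is the callback body.

```javascript
// input: the array being mapped over.
const data = [
  {week: 1, sales: 10},
  {week: 2, sales: 12},
];

// {$map: {input: '$data', as: 'datum', in: {...}}} expressed as a JS map:
// 'datum' plays the role of the `as` variable, the object literal the `in`.
const mapped = data.map(datum => ({
  week: datum.week,
  doubled: datum.sales * 2, // any expression over '$$datum' goes here
}));
```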

The $filter operator also takes three parameters: input, as, and cond. input selects the array in the current document you want to filter, as names the variable bound to each element, and cond is the expression that must evaluate to true for an element to be kept. $filter returns a subset of the input array.
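Combining the two, the previousWeeks idea from the pipeline can be reproduced with Array.prototype.map and Array.prototype.filter (a plain-JavaScript sketch over hypothetical indexed rows, not MongoDB itself):

```javascript
// Rows as they look after the $unwind with includeArrayIndex.
const data = [
  {idx: 0, week: 1, sales: 10},
  {idx: 1, week: 2, sales: 12},
  {idx: 2, week: 3, sales: 14},
];

// For each row, keep only the rows that came before it in week order,
// mirroring cond: {$lt: ['$$filterDatum.idx', '$$datum.idx']}.
const withHistory = data.map(datum => ({
  ...datum,
  previousWeeks: data.filter(filterDatum => filterDatum.idx < datum.idx),
}));
```

The first week gets an empty previousWeeks array, and each later week accumulates everything before it.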

The result of this operation: each unwound document now carries a previousWeeks array containing every earlier week for its source.
