未加星标

Map/Reducing the Pain of Dealing With Arrays

字体大小 | |
[前端(javascript) 所属分类 前端(javascript) | 发布者 店小二03 | 时间 2017 | 作者 红领巾 ] 0人收藏点击收藏

Map/Reducing the Pain of Dealing With Arrays
This is a rather meaty post that will hopefully shed some light on concepts that have been misunderstood and under-utilized, in the general community. The goal of this post is to explore the concepts behind map and reduce , and to illustrate how they can simplify algorithms for dealing with arrays of data. TL;DR

If you are a Map/Reduce pro in javascript, you know how they each work under the covers, and you’ve been using them to write highly-testable solutions to real-world problems, you’re likely good to skip this article, given two caveats:

You understand that map hasnothing to do with looping, except that it is usually used as an array method, and we can easily use other mappable objects

Regardless, you are likely going to want to check out thereal-ish world examples.

What is Map?

Well, let’s look at some code.

What if we had some array of numbers, and we wanted to increase each value by 1?

let numbers = [1, 2, 3, 4]; function addOneToEach (numbers) { for (let i = 0; i < numbers.length; i += 1) { const number = numbers[i]; numbers[i] = number + 1; } }

There’s a problem with this, though.

The sort of function that mutates external references is very hard to reason with. As soon as this type of code is running in someone else’s module on an array that you own, your understanding of your own code goes out the window.

We learned the hard way that global variables are a terrible idea. Having variables that can change value on you, unexpectedly, because someone else’s code changed them, makes it very hard to track down bugs. It’s even harder to test that there aren’t any. It’s safe to say that the local version of the same problem (mutating local members/variables), is just as bad; perhaps worse, because with globals at least you are assuming that you’re working with something that might backfire.

Typically, the way to solve that problem is to return a copy that is modified, rather than modifying the original array.

function addOneToEach (numbers) { let newNumbers = []; for (let i = 0; i < numbers.length; i += 1) { const number = numbers[i]; newNumbers.push(number + 1); } return newNumbers; }

ES6 also brought us the ability to write terse iterations over iterable primitives.

function addOneToEach (numbers) { let newNumbers = []; for (let number of numbers) { newNumbers.push(number + 1); } return newNumbers; }

In each of these cases, we’re still writing a bunch of boilerplates.

In the case of any of the looping code, there is really only one part of the whole algorithm that actually concerns itself with the " addOne " part.

That’s the number + 1 sub-expression, regardless of whether we’re pushing the result, or assigning it, the only thing of business value in that algorithm is that expression.

What if we wanted a similar algorithm to double each element?

function addOneToEach (numbers) { let newNumbers = []; for (let number of numbers) { newNumbers.push(number + 1); } return newNumbers; } function doubleEach (numbers) { let newNumbers = []; for (let number of numbers) { newNumbers.push(number * 2); } return newNumbers; }

That’s not a lot of code on its own, but soon enough, you're going to have a file full of loops, and very few lines which provide any real value.

They say that if you write the same thing three times, you should refactor it .

How many loops have you written in your life?

What would we refactor it into?

What if we could make a higher-order function that did our looping for us, and let us run just the custom part of the code that we care about?

function loopOverAndTransformEach (transform, array) { const newArray = []; for (let el of array) { newArray.push(transform(el)); } return newArray; }

We don’t have to write another loop ever again, for the sake of changing values in arrays.

Putting this to use, we can solve a lot of problems.

const incremented = loopOverAndTransformEach(x => x + 1, [1, 2, 3]); // [2, 3, 4] const employees = loopOverAndTransformEach(convertPersonToEmployee, people);

The code here is much more expressive.

These lines are practically self-documenting.

Even better, we could return another function that expects the array in a separate function call.

const addOneToEach = loopOverAndTransformEach(x => x + 1); addOneToEach([1, 2, 3]); // [2, 3, 4] addOneToEach([2, 3, 4]); // [3, 4, 5] const convertPeopleToEmployees = loopOverAndTransformEach(convertPersonToEmployee); convertPeopleToEmployees(somePeople); // someEmployees convertPeopleToEmployees(otherPeople); // otherEmployees

Great news; you have been using map this whole time!

We call it map because it’s about one value mapping to another.

Mapping is Not a]About Loops

Have you seen the detective shows where some encoded message is using the most simple cypher ever?

|Code|Letter| |:-:|:-:| | 01 | A | | 02 | B | | 03 | C | | ...|...| | 26 | Z |

Each number maps onto its corresponding letter. In our world, the function defines the steps to get from the value in the A column to the value in the B column.

const convertPeopleToEmployees = map(convertPersonToEmployee); convertPeopleToEmployees(specificPeople); // specificEmployees

An important consideration here is that in JS we almost always talk about map in the context of arrays.

But it’s not really a way of dealing with arrays. It’s about mapping one value into a new value, and getting the same type back.

So with ES6’ arrow function, we can call [1, 2, 3].map(x => x + 1); (the map method has been around since ES5) and we get a new array back.

In certain Promise libraries, they give you a map method. It gives you a new Promise, with the value transformed. In Observable libraries, map gives you a new Observable, with its values transformed.

So the takeaway is that you don’t have to care if map loops or not, or how it loops, or in some languages how many threads it uses. If you are using a map method on an array, or on a library Promise or Observable, or anything else, it will do what it has to do to transform the value(s), and give you back the same type, so that you can keep calling map on the result.

Our function isn’t good enough to deal with many other mappable types (because it manually loops), but that’s as easy as having the function default to calling map on whatever it’s passed.

function map (transform, mappable) { return mappable.map(transform); }

Problem solved. We can pass anything that has a similar map method through, and it will just work. You could have your own object with its own map behaviour, and it would be compatible.

In fact, we can create our own Mappable objects. We'll just make a constructor that takes a value, and returns an object with its own map method.

Remember that what gets returned from the map needs to be a new Mappable object (so that you can keep chaining calls to map ).

function Mappable (x) { return { value: x, // return a new mappable, filled with f(x) map: transform => Mappable(transform(x)) }; } Mappable(10) .map(x => x * 2) // Mappable(20) .map(y => y + 1) // Mappable(21) .map(z => z / 3) // Mappable(7) .value; // 7 const mappable1 = Mappable("hi"); const mappable2 = map(str => str.toUpperCase(), mappable1); mappable2.value; // "HI" const person = Mappable(personData); convertPeopleToEmployees(person); // Mappable(employee) What About Reduce?

We’ve covered a lot of ground with map .

A lot of the same thinking applies to reduce .

What reduce actually represents is a means of summarizing data as a single value.

const numbers = [1, 2, 3, 4]; const initialValue = 0; const getSum = (total, number) => total + number; const sum = numbers.reduce(getSum, initialValue);

In this example, our code only cares about the final value, and the steps that we take on an element to get there.

In this function, we have a total and a number . We also have an initialValue which sets the value of total for the first run of the function.

total is an aggregation. Whatever we return from our summarizing function becomes the new aggregated value. When we run out of elements to aggregate, the final value becomes the returned result of the reduce .

If we were to map the actions of the example into a table, where each row was running the function on the next element in the array, we would end up with the following:

| total | number | body | return | | -: | -: | :-: | -: | | initialValue |1| 0 + 1 |1| |1|2| 1 + 2 |3| |3|3| 3 + 3 |6| |6|4| 6 + 4 |10|

console.log(sum); // 10

The type of the final value does not need to match the type of the element. In this case, the type of total and the type of each number are Number. That is not always the case.

It’s okay if that seems unintuitive; we’ll get into more examples momentarily.

Inside of a reduce, you are given the current value of the accumulator, and the next element to be included in the summary. Whatever you return becomes the new value of the accumulator.

function reduce (summarize, initialValue, array) { let result = initialValue; for (let el of array) { result = summarize(result, el); } return result; }

Let’s break from the theory for a second, and look at some practical (if simplistic) problems which can be spotted in the wild.

Provide the sum of the elements in an array: const numbers = [1, 2, 3, 4]; // imperative example let sum = 0; for (let i = 0; i < numbers.length; i += 1) { const x = numbers[i]; sum += x; } // functional example const add = (accumulator, x) => accumulator + x; const sum = numbers.reduce(add, 0); // terse example const sum = numbers.reduce((x, y) => x + y, 0);

Note that once you know what reduce does, even the terse version can be very simple to understand.

Combine chunks of text into one string: const strings = ["a", "b", "c"]; // imperative example let str = ""; for (let i = 0; i < strings.length; i += 1) { const chunk = strings[i]; str += chunk; } // functional example const combine = (accumulator, x) => accumulator + x; const str = strings.reduce(combine, ""); // terse example const str = strings.reduce((x, y) => x + y, ""); Remove duplicate values: const duplicates = [1, 2, 3, 2, 3, 6, 4, 5]; // imperative let uniques = []; for (let i = 0; i < duplicates.length; i += 1) { let value = duplicates[i]; let found = false; for (let j = 0; j < uniques.length; j += 1) { let test = uniques[j]; if (value === test) { found = true; break; } } if (!found) { uniques.push(value); } } // human-friendly imperative let uniques = []; for (let i = 0; i < duplicates.length; i += 1) { let value = duplicates[i]; let found = uniques.indexOf(value) >= 0; if (!found) { uniques.push(value); } } // functional const includes = (value, array) => array.indexOf(value) >= 0; const includeFirstOccurrence = (array, value) => { const found = includes(value, array); return found ? array : array.concat(value); }; const uniques = duplicates.reduce(includeFirstOccurrence, []); // terse const includes = (value, array) => array.indexOf(value) >= 0; const uniques = duplicates.reduce((array, value) => includes(value, array) ? array : array.concat(value), []);

It’s key to note, here, that the type of each element is a Number, while the value of the accumulator is an Array. This is a clear example of the aggregator and the elements being of different types.

| array | value | return | | :-- | --: | :-- | | [] | 1 | [1] | | [1] | 2 | [1,2] | | [1,2] | 3 | [1,2,3] | | [1,2,3] |2| [1,2,3] | | [1,2,3] |3| [1,2,3] | | [1,2,3] |6| [1,2,3,6] | | [1,2,3,6] |4| [1,2,3,6,4] | | [1,2,3,6,4] |5| [1,2,3,6,4,5] | Flatten the contents of a 2D group of arrays:

As a preface to this problem, we often find ourselves with an array of stuff to deal with.

Sometimes, we find ourselves with two or more arrays of the same type of stuff, and what we really need is one array that combines it all.

We usually get there via array concatenation, and this is true in just about any language.

In more traditional cases we might say something like:

List<Item> items = new ArrayList<Item>(); items.addAll(itemsOne); items.addAll(itemsTwo); // ... items.addAll(items_N);

We typically did this line by line. If the list of arrays that you need to join just keeps growing, why not make it a list, itself, that you can loop through and join them all with fewer lines of code?

If that sounds confusing, don’t worry; we’re going to start with a simplified version for now, and see stronger use cases for this pattern at the end.

const groups = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ]; // imperative example let flattened = []; for (let i = 0; i < groups.length; i += 1) { const subgroup = groups[i]; for (let j = 0; j < subgroup.length; j += 1) { const x = subgroup[j]; flattened.push(x); } } // human-friendly imperative example let flattened = []; for (let i = 0; i < groups.length; i += 1) { const subgroup = groups[i]; flattened = flattened.concat(subgroup); } // functional example const concat = (array, group) => array.concat(group); const flattened = groups.reduce(concat, []); // terse example const flattened = groups.reduce((a, b) => a.concat(b), []); | array | group | return | | :-- | :-: | :-- | | [] | [1,2,3] | [1,2,3] | | [1,2,3] | [4,5,6] | [1,2,3,4,5,6] | | [1,2,3,4,5,6] | [7,8,9] | [1,2,3,4,5,6,7,8,9] | console.log(groups); // [[1,2,3],[4,5,6],[7,8,9]] console.log(flattened); // [1,2,3,4,5,6,7,8,9] What if you wanted to take a 2D array of numbers, increment each number by one, remove duplicates, and output as a string of the remaining numbers? const grid = [ [1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9] ]; const includes = (value, array) => array.indexOf(value) >= 0; const concat = (a, b) => a.concat(b); const keepFirstOccurrence = (arr, el) => includes(el, arr) ? arr : arr.concat(el); const flattenAddOneToUniquesAndSerialize = grid => grid // make one big array .reduce(concat, []) // remove duplicates .reduce(keepFirstOccurrence, []) // add one to each of the remaining numbers .map(x => x + 1) // convert to strings .map(x => `${x}`) // concat strings together .reduce(concat, ""); const result = flattenAddOneToUniquesAndSerialize(grid);

Each of the functions used in the solution is happily reusable, and very easily testable.

The entire algorithm itself is equally testable, and really just six lines which are very focused on solving the problem at hand, instead of the looping and branching.

Also note that we used concat on both arrays and strings. We didn’t have to, of course, but here we took advantage of polymorphic reuse of that function.

What about more complex real-world problems?

Believe it or not, that’s where these techniques shine.

Problem 1: Accounting

You’re the lucky developer who was picked to help generate sales reports at your company, while the finance team switches accounting software.

They want three reports:

A list of sales teams, by team name and total revenue earned The total revenue made by the company, regardless of team The name of the employee who made the most money

You know that the data looks like:

const data = [ { // SalesTeam object name: "AB-Team", // team.name members: [ { // A SalesPerson object name: "Bob McKenzie", // team.members[0].name sales: 1000 // team.members[0].sales - in dollars }, // ... more sales people ] }, // ... more sales teams ]

So, how might you generate the reports?

Let’s start with the easiest, which might surprisingly be the sum of all teams.

// give me the total of all sales from // `data[i].members[j].sales` const getTotalRevenue = data => // a list of teams; each team has a list of members data // a list of lists of members .map(team => team.members) // a list of all members .reduce((a, b) => a.concat(b), []) // a list of all revenue numbers .map(person => person.sales) // adding all revenue together .reduce((total, sales) => total + sales, 0);

Next, let’s look at getting the person who sold the most.

If you consider what we’re doing, the last report isn’t all that different.

Rather than the sales data out of the final list, we need to keep track of the whole person. And we probably want a way of comparing two people.

Beyond that, the first part of the function should be exactly the same; we want to turn the data into a list of lists, and then flatten it.

// I’m going to use this fake person to compare // against the first salesperson. // this saves me from having special cases like // null-checks in my code const emptyPerson = { name: "Nobody", sales: -Infinity }; // give me the person with the highest sales so far const getBestBySales = (best, person) => person.sales > best.sales ? person : best; // give me the person with the highest sales of all const getBestSalesperson = data => data // go from a list of teams to a list of lists of people .map(team => team.members) // go from a list of lists of people to a list of people .reduce((a, b) => a.concat(b), []) // go from a list of people to a person .reduce(getBestBySales, emptyPerson);

Did you notice the potential bug? What happens if people tie for first place? You could solve that by keeping a list of all of the top performers and replacing it with a new list, if the next salesperson beat all of the ties. Regardless, it should be really easy to spot (and fix), now that the logic is all on its own.

The last report is going to be a little different, isn’t it. It’s different in that we need a list of teams. We already have a list of teams.

The list we need, though, has a name and a revenue, not a name and a list of members.

// give me a list of teams like `{ name: "...", revenue: ... }` const getRevenueByTeam = data => // make new objects with "name" and "revenue" properties data.map(team => ({ name: team.name, revenue: team.members // list of people to list of sales .map(person => person.sales) // list of sales to a number .reduce((total, sales) => total + sales, 0) }));

That’s all there is to it. We just went through and turned all of the inner arrays into numbers. We didn’t have to worry about anything else.

The outer map creates a new object for each team, and that map uses an inner map/reduce to calculate one of its properties.

You could extract the map/reduce into a separate function, and pass the array in. You could outsource the whole object construction to a separate function. Either of those is fine and frequently preferable. In this case, it’s still small, testable, and doing one job.

Problem 2: Word-Counting

Let’s say you’re working on a primitive site scraping system, and the sub-problem you are working on right now is getting out of hand.

You are being handed sets of pages from an arbitrary site, and the data team wants a dictionary of each word that appears on any page, and the number of times that word appears anywhere on the site.

Soon, you’re thinking about looping through pages, and in that loop, you’re looping through lines, and then you’re looping through words. This is exactly how I would have considered doing it in any language where I was expected to readLine .

for (let i = 0; i < pages.length; i += 1) { // ...prep page and lines for (let j = 0; j < lines.length; j += 1) { // ...prep line and words for (let k = 0; k < words.length; k += 1) { // clean and lower-case and add to frequency map } } }

But what if we thought about this differently?

For this feature, we don’t really need an understanding of lines or of pages. The ask is really just for words.

So we can start by thinking about how to arrive at one huge list of words, based on a set of pages given to us.

// concatenate two things together // A.concat(B) returns an object of type A const concat = (a, b) => a.concat(b); // I don’t care about lines. Just split on all whitespace. const getWordsFromPage = page => page.trim().split(/s+/); // Clean the punctuation off of a word. const removePunctuation = word => word.replace(PUNCTUATION_REGEX, ""); // Increase the frequency of this word, in the dictionary. const calculateFrequency = (dictionary, word) => { const count = dictionary[word] || 0; dictionary[word] = count + 1; return dictionary; }; // Give me one clean, lower-case list made up of // all of the words on the site. const words = pages // lowercase the page, so the frequency is accurate .map(page => page.toLowerCase()) // turn each page into a list of words .map(getWordsFromPage) // turn the list of lists of words into a list of words .reduce(concat, []) // clean words .map(removePunctuation); // populate the dictionary const frequencyMap = words.reduce(calculateFrequency, { });

“But wait; couldn’t we also just concatenate all of the pages into one big page and work from there?”

Well, we could. But as soon as we concatenate all of those together, into one big string, it's no longer mappable.

// turn all pages into one big string const page = pages.reduce(concat, ""); // lowercase and split into words const frequency = getWordsFromPage(page.toLowerCase()) .map(removePunctuation) // remove punctuation from words .reduce(calculateFrequency, {}); // populate the dictionary

Notice that we turn it into a string, and then turn it right back into an array, to continue on.

There’s also the potential for a serious error in this example, depending on the formatting of pages of text.

If there is no whitespace at the start or end of any page, then the first word of a page will be added to the last word of the previous page. This can be solved with a slightly more complex reducer, but is a problem that we didn’t have to confront in the naive, set-based case.

There’s nothing wrong with this, it’s merely a different way of looking at the solution of the problem. And hopefully, that’s what this post has provided: a different way of looking at problems.

Learn More

If you're looking for more in-depth tutorials and learning opportunities, check out our various JavaScript training options . Better yet, inquire about custom training to ramp up your knowledge of the fundamentals and best practices with custom course material designed and delivered to address your immediate needs.

本文前端(javascript)相关术语:javascript是什么意思 javascript下载 javascript权威指南 javascript基础教程 javascript 正则表达式 javascript设计模式 javascript高级程序设计 精通javascript javascript教程

主题: JavaScriptJavaWordUATI
分页:12
转载请注明
本文标题:Map/Reducing the Pain of Dealing With Arrays
本站链接:http://www.codesec.net/view/534354.html
分享请点击:


1.凡CodeSecTeam转载的文章,均出自其它媒体或其他官网介绍,目的在于传递更多的信息,并不代表本站赞同其观点和其真实性负责;
2.转载的文章仅代表原创作者观点,与本站无关。其原创性以及文中陈述文字和内容未经本站证实,本站对该文以及其中全部或者部分内容、文字的真实性、完整性、及时性,不作出任何保证或承若;
3.如本站转载稿涉及版权等问题,请作者及时联系本站,我们会及时处理。
登录后可拥有收藏文章、关注作者等权限...
技术大类 技术大类 | 前端(javascript) | 评论(0) | 阅读(12)