How do you search strings?

I got my hands on angular.js, and I was surprised by how MVC JavaScript frameworks are now trying to “semantify” HTML via directives. It is a really cool idea with a promising future. You also see an ever-increasing number of web platforms opening up APIs, but we know there is a lot of interesting data you cannot just query from an API. So what do you do? You end up scraping, parsing logs, applying regexes, and so on.

Possible approaches for String Matching

You can take a few of these approaches to perform a quick search on API-retrieved data:

Manual Search - A verbatim match on the exact string, but it can become expensive as you scale.

Regex - If you take a closer look at your data, you might define regular expressions to extract parts of the potential matching keys, e.g. gsub('.*?([0-9]+GB).*', '\\1', 'Apple iPhone 16GB black') to extract the memory size in GB from the name of a device, and try to match by several fields rather than just one. But there are so many special cases to consider that you might well end up in “regex hell”.

Fuzzy Search - Mark van der Loo released an R package called stringdist with additional popular fuzzy string matching methods. These fuzzy string matching methods don’t know anything about your data, but you might. In fuzzy matching, you want an approximate distance between the shorter key and portions of the longer key with a similar number of words, to decide whether there is a match. There are three variants involved in fuzzy search. Basically, the process is done in three steps:

1. Reading the data from both sources
2. Computing the distance matrix between all elements
3. Pairing the elements with the minimum distance
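As a rough illustration of the regex approach described above, the sketch below mirrors the gsub example in JavaScript: it pulls the memory size out of a device name so two records can be compared on that field. The function name extractMemory and the sample device names are made up for this illustration.

```javascript
// Extract the memory size (e.g. "16GB") from a free-text device name.
// Returns null when the name contains no such token.
function extractMemory(name) {
  const match = name.match(/([0-9]+)\s*GB/i);
  return match ? `${match[1]}GB` : null;
}

// Hypothetical sample names, not taken from any real data set.
const a = extractMemory('Apple iPhone 16GB black');
const b = extractMemory('iphone 16 gb (black)');
const c = extractMemory('Apple iPhone black');

console.log(a, b, c);
```

Two differently formatted names normalize to the same key here, but every new formatting quirk needs another tweak to the pattern, which is how the special cases pile up.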

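The three fuzzy-matching steps (read both sources, compute the distance matrix, pair by minimum distance) can be sketched in plain JavaScript. This is a minimal sketch that assumes Levenshtein edit distance as the metric, which is only one of several possible distances. The sample strings and the names sourceA, sourceB and pairs are invented for the example.

```javascript
// Levenshtein edit distance between two strings (classic dynamic programming).
function levenshtein(s, t) {
  // prev[j] is the distance between the first i-1 chars of s and the first j chars of t.
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Step 1: "read" the data from both sources (hard-coded, made-up samples here).
const sourceA = ['iphone 16gb black', 'galaxy s4 white'];
const sourceB = ['Samsung Galaxy S4 White', 'Apple iPhone 16GB Black', 'Nokia Lumia'];

// Step 2: distance matrix, one row per element of sourceA.
const normalize = (s) => s.toLowerCase().trim();
const distances = sourceA.map((a) =>
  sourceB.map((b) => levenshtein(normalize(a), normalize(b)))
);

// Step 3: pair each element of sourceA with its minimum-distance partner.
const pairs = distances.map((row, i) => {
  const j = row.indexOf(Math.min(...row));
  return { key: sourceA[i], match: sourceB[j], distance: row[j] };
});

console.log(pairs);
```

The matrix has one cell per pair of elements, so the cost grows with the product of the two list sizes. That is acceptable for small lists and worth keeping in mind for large ones.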
You can have a look at the three variants in R.

Introducing Fuse.IO

Kirollos Risk has written a clean, lightweight JavaScript library for fuzzy search. You simply define the gravity (the R variants) of the search and provide a JSON data set; check out the library for more.

The example shown below gives you two options: match by exact string, or match by gravity (variant) to put more weight on one key than on another.

Searching by ID and Gravity

In the two implementations below, you can simply search by a matching keyword (ID), the ISBN in this case. If you want to add more relevancy to the result set, you can run a weighted search by adding a gravity (R-variant), as shown in the picture below.
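To make the difference between the two modes concrete, here is a small self-contained sketch. It does not use the library's API: findByIsbn, fieldScore and weightedSearch are hypothetical helpers, the ISBN values are placeholders, and a simple word-containment score stands in for a real fuzzy metric. The weights play the role of the gravity described above.

```javascript
// Illustrative records; the ISBN values are placeholders, not real ISBNs.
const books = [
  { isbn: 'ISBN-001', title: 'The Lock Artist', author: 'Steve Hamilton' },
  { isbn: 'ISBN-002', title: "Old Man's War", author: 'John Scalzi' },
  { isbn: 'ISBN-003', title: 'The Hamilton Case', author: 'Michelle de Kretser' },
];

// Mode 1: exact match on the ID field.
function findByIsbn(records, isbn) {
  return records.find((r) => r.isbn === isbn) || null;
}

// Fraction of query words that occur somewhere in the field (0 to 1).
function fieldScore(query, field) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const text = field.toLowerCase();
  const hits = words.filter((w) => text.includes(w)).length;
  return hits / words.length;
}

// Mode 2: weighted search. `weights` maps a key name to its importance.
function weightedSearch(records, query, weights) {
  return records
    .map((record) => {
      let score = 0;
      for (const [key, weight] of Object.entries(weights)) {
        score += weight * fieldScore(query, String(record[key] ?? ''));
      }
      return { record, score };
    })
    .filter((r) => r.score > 0)          // drop records that match nothing
    .sort((a, b) => b.score - a.score);  // best score first
}

const byId = findByIsbn(books, 'ISBN-002');
const titleFirst = weightedSearch(books, 'hamilton', { title: 0.7, author: 0.3 });
const authorFirst = weightedSearch(books, 'hamilton', { title: 0.3, author: 0.7 });

console.log(byId, titleFirst, authorFirst);
```

The same query ranks the two matching books in opposite order depending on which key carries more weight, which is the effect of stressing one key over another.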

Searching Arrays of Strings and Arrays of Objects

Searching an array of strings or an array of objects is also pretty straightforward: you simply define the relevancy for your search and the results behave accordingly.
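The sketch below shows one way the same search routine could serve both an array of strings and an array of objects. It is an assumption-laden illustration, not the library's implementation: it scores candidates with a bigram Dice coefficient, and the names bigrams, similarity, fuzzyFilter and the minSimilarity cutoff are hypothetical.

```javascript
// Split a string into lowercase character bigrams, e.g. "fuse" -> ["fu", "us", "se"].
function bigrams(text) {
  const s = text.toLowerCase();
  const out = [];
  for (let i = 0; i < s.length - 1; i++) out.push(s.slice(i, i + 2));
  return out;
}

// Dice coefficient on bigrams: 1 means all bigrams shared, 0 means none shared.
function similarity(a, b) {
  const x = bigrams(a);
  const y = bigrams(b);
  if (x.length === 0 || y.length === 0) {
    // Strings shorter than two characters have no bigrams; fall back to equality.
    return a.toLowerCase() === b.toLowerCase() ? 1 : 0;
  }
  const counts = new Map();
  for (const g of x) counts.set(g, (counts.get(g) || 0) + 1);
  let shared = 0;
  for (const g of y) {
    const n = counts.get(g) || 0;
    if (n > 0) {
      shared++;
      counts.set(g, n - 1);
    }
  }
  return (2 * shared) / (x.length + y.length);
}

// getText turns an item into the string to compare; by default the item is the string.
function fuzzyFilter(items, query, minSimilarity, getText = (item) => item) {
  return items
    .map((item) => ({ item, score: similarity(query, getText(item)) }))
    .filter((r) => r.score >= minSimilarity)
    .sort((a, b) => b.score - a.score);
}

// Array of strings, queried with a misspelling.
const names = ['Germany', 'Greece', 'Georgia', 'Grenada'];
const fromStrings = fuzzyFilter(names, 'germny', 0.5);

// Array of objects: same routine, plus an accessor for the key to search.
const countries = [
  { name: 'Germany', code: 'DE' },
  { name: 'Greece', code: 'GR' },
];
const fromObjects = fuzzyFilter(countries, 'grece', 0.5, (c) => c.name);

console.log(fromStrings, fromObjects);
```

Raising minSimilarity makes the search stricter, and lowering it lets more distant candidates through, so the cutoff is where the relevancy of the result set is tuned in this sketch.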

Coding fuzzy logic for a data set

Let us code fuzzy logic for a sample data set. For the sake of implementation, I researched a few open data sets and found country data interesting to search; here is my JSON data, hosted on a myjson server.

