Stroke Width Transform algorithm for Python

Category: Development (Python) | Posted by: 店小二03 | Date: 2017 | Author: 红领巾

Stroke Width Transform (SWT) is a computer vision algorithm (strictly speaking, an image operator) that can be used to detect text in images. This is a non-trivial task, especially for camera pictures, but SWT performs pretty well in this field.

TL;DR

How to wrap the libccv computer vision library so that SWT can be used from Python. The wrapper was built using SWIG.

Intro

My use case for SWT was the following: split a big dataset of unknown images into ones that contain text (most likely documents or scans) and the rest. Easy, right? Yes, as long as you have a fast implementation of the algorithm. Unfortunately, I needed to use Python, because the image filtering was only one part of a bigger solution. I searched here and there and, of course, there are some implementations in Python, but they are not ready for production. One of them, for example, took 5 minutes to process a single 2000 x 3000 px image and allocated gigabytes of memory. NOPE to that.

So... I searched further, but in the end I found nothing that was either working or performant. I decided to look for an SWT implementation in any other language and found libccv. A quick look at the repository: it's written in C. OK, not my kind of language, but I gave it a try, since it would surely be fast. I compiled the example application, which takes an image as input and produces output in the form of 4-element tuples (x, y, width, height), each describing a rectangle that contains text in the image. It produced satisfying results, so I decided to use it from Python. How? Let's dig into it!

Prerequisites

- Python 2.7.* on Linux (tested on Ubuntu 16.04 x64)
- Python packages: scikit-image, numpy, matplotlib (only for preview)
- gcc compiler
- SWIG
- basic knowledge of C

SWT example breakdown from libccv

For starters, take a look at https://github.com/liuliu/ccv/blob/07fc691c5344940751011c3af96d0ab202b1b4e6/bin/swtdetect.c from the libccv repository, an example of SWT text detection. It is a console application that takes a file name as an argument, runs the algorithm and prints the coordinates of the rectangles containing text to standard output.

The problems with it are the following:

- it is a console application
- it takes a file name as input
- it outputs to stdout

My requirements for the function (which I wanted to call from Python) were the following:

- it is a standalone library / module
- it takes the image as a blob of bytes (nothing is written to disk)
- it returns an array-like object

SWT function wrapper in C

My requirements are pretty simple to translate into a function header:

    int* swt(char *bytes, int array_length, int width, int height);

Translating the code above into plain English: take an array of bytes, its length (welcome to the C world!), the width and height of the image, do some magic and return an array of numbers.

In order to implement and compile the wrapper without issues, just check out libccv from its repository and put your files in the libccv/lib folder. This approach is suggested by the libccv author.

Implementation of the SWT wrapper for Python:

    #include "ccv.h"
    #include <jpeglib.h>
    #include "io/_ccv_io_libjpeg.c"
    #include <sys/time.h>
    #include <ctype.h>

    #define SUCCESS 1
    #define FAILURE 0

    int* swt(char *bytes, int array_length, int width, int height){
        int *result_array;
        int status = FAILURE;

        ccv_enable_default_cache();
        ccv_dense_matrix_t* image = 0;

        /* Wrap the in-memory JPEG bytes in a FILE* stream. */
        FILE *stream;
        stream = fmemopen(bytes, array_length, "r");
        if(stream != NULL){
            /* Ask for a grayscale image. */
            int type = CCV_IO_JPEG_FILE | CCV_IO_GRAY;
            int ctype = (type & 0xF00) ? CCV_8U | ((type & 0xF00) >> 8) : 0;
            _ccv_read_jpeg_fd(stream, &image, ctype);
            fclose(stream); /* release the stream opened by fmemopen */

            if (image != 0) {
                ccv_array_t* words = ccv_swt_detect_words(image, ccv_swt_default_params);
                if (words) {
                    int i;
                    int result_idx = 1;
                    /* Slot 0 holds the number of values that follow. */
                    result_array = (int*)malloc((4 * words->rnum + 1) * sizeof(int));
                    result_array[0] = 4 * words->rnum;
                    for (i = 0; i < words->rnum; i++) {
                        ccv_rect_t* rect = (ccv_rect_t*)ccv_array_get(words, i);
                        result_array[result_idx++] = rect->x;
                        result_array[result_idx++] = rect->y;
                        result_array[result_idx++] = rect->width;
                        result_array[result_idx++] = rect->height;
                    }
                    ccv_array_free(words);
                    status = SUCCESS;
                }
                ccv_matrix_free(image);
            }
            ccv_drain_cache();
        }

        /* On any failure, return a one-element array holding 0. */
        if(status != SUCCESS){
            result_array = (int*)malloc(1 * sizeof(int));
            result_array[0] = 0;
        }
        return result_array;
    }

Explanation

There are some hacky things in there; I will explain them one by one.

    FILE *stream;
    stream = fmemopen(bytes, array_length, "r");

My function takes an array of bytes as input, but none of the ccv_read variants accepts this kind of data. I found the great fmemopen function (part of POSIX), which wraps a char* buffer in a FILE* handle. But... none of the ccv_read variants accepts that kind of data either (AGAIN!). Yeah, but they read the JPEGs from files somehow!
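For readers more at home in Python, the closest analogue of fmemopen is io.BytesIO: it wraps an in-memory byte buffer in a file-like object, so code that expects to read from a file never touches the disk. The sketch below only illustrates that idea and is not part of the wrapper. The sample bytes (a JPEG start-of-image marker followed by filler) and the helper name read_magic are made up for the example.

```python
import io


def read_magic(stream, count=2):
    """Read the first `count` bytes from any file-like object (hypothetical helper)."""
    return stream.read(count)


# Made-up sample data: a JPEG start-of-image marker (FF D8) plus filler bytes.
image_bytes = b"\xff\xd8" + b"\x00" * 16

# BytesIO plays the role that fmemopen plays in C: a "file" backed by memory.
stream = io.BytesIO(image_bytes)
magic = read_magic(stream)
remaining = len(stream.read())  # the rest of the buffer is read like a file
stream.close()

print("starts with JPEG marker: %s, bytes left: %d" % (magic == b"\xff\xd8", remaining))
```

Back in C, the open question was still which libccv routine could read from such a stream.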

That led me to the next part:

    #include <jpeglib.h>
    #include "io/_ccv_io_libjpeg.c"

In the _ccv_io_libjpeg.c file, there's a function that accepts a FILE* handle and reads a JPEG image into the libccv image format. I wanted to use it.

    int type = CCV_IO_JPEG_FILE | CCV_IO_GRAY;
    int ctype = (type & 0xF00) ? CCV_8U | ((type & 0xF00) >> 8) : 0;
    _ccv_read_jpeg_fd(stream, &image, ctype);

Libccv needs a grayscale image as input for SWT, hence the bitwise magic to set the proper flags.
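The arithmetic behind that expression can be traced in a few lines of Python. The numeric values below are assumptions about the constants defined in ccv.h (they are not given in this article), so verify them against your checkout before relying on them. The function name matrix_type is hypothetical.

```python
# Assumed values of the libccv constants (verify against ccv.h in your checkout).
CCV_IO_JPEG_FILE = 0x022
CCV_IO_GRAY = 0x100
CCV_8U = 0x01000
CCV_C1 = 0x001


def matrix_type(io_type):
    """Mirror the C expression: (type & 0xF00) ? CCV_8U | ((type & 0xF00) >> 8) : 0."""
    color_bits = io_type & 0xF00       # keep only the colour-conversion modifier
    if color_bits == 0:
        return 0                       # no modifier was requested
    return CCV_8U | (color_bits >> 8)  # 8-bit flag plus the modifier moved to the low bits


ctype = matrix_type(CCV_IO_JPEG_FILE | CCV_IO_GRAY)
print(hex(ctype))
```

Under these assumed values, the mask 0xF00 isolates the grayscale modifier, and shifting it right by 8 bits lands it in the position of the single-channel flag, so the result equals CCV_8U | CCV_C1.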

    int *result_array;
    result_array = (int*)malloc((4 * words->rnum + 1) * sizeof(int));
    result_array[0] = 4 * words->rnum;

Dynamic arrays were (and still are) such a pain in C. SWT outputs 4 numbers for every match, so the array needs 4 slots per match. The +1 is my hack to return everything in a single array and read it in Python: I use that one additional slot, at index 0, to store the number of values that follow (I will drop this value in the upcoming SWIG wrapper, wait for it!).
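On the receiving side, this length-prefixed layout is easy to unpack. The following sketch assumes the values have already been copied into a plain Python list; the actual SWIG typemap may well do this unpacking in C instead. The function name decode_rects and the sample numbers are hypothetical.

```python
def decode_rects(flat):
    """Turn [n, x0, y0, w0, h0, x1, ...] into a list of (x, y, width, height) tuples."""
    count = flat[0]                    # slot 0: number of values that follow
    values = flat[1:1 + count]         # drop the length prefix
    if count % 4 != 0 or len(values) != count:
        raise ValueError("malformed rectangle array")
    return [tuple(values[i:i + 4]) for i in range(0, count, 4)]


# Made-up sample: two rectangles, so the prefix is 8.
sample = [8, 10, 20, 100, 30, 15, 60, 80, 25]
rects = decode_rects(sample)
print(rects)
```

A prefix of 0 simply decodes to an empty list, so the caller needs no special case for "nothing found".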

    if(status != SUCCESS){
        result_array = (int*)malloc(1 * sizeof(int));
        result_array[0] = 0;
    }

When something goes wrong, I still want to return something, so I just create a 1-element array containing 0, meaning there is nothing to pass to Python from the wrapper.

Python wrapper for SWT using SWIG

SWIG ( http://www.swig.org/ ) seems to be really old, but it does the job pretty well (with a really small amount of code). Plus, it's still active in 2017. It lets you wrap any C code into a Python module really fast, which is exactly what I wanted. You can install SWIG with the following command:

    sudo apt-get install swig

In order to build the wrapper, you need to create a SWIG interface file (I named it ccvwrapper.i):

    %module ccvwrapper
    %{
    #include "ccvwrapper.h"
    %}

    %typemap(out) int* swt {
      int i;
