
Multiprocessing could be even slower than a single thread

[Category: System (Linux) | Publisher: 店小二04 | Date: 2017 | Author: 红领巾]

I have always used multiprocessing in my web crawlers to speed up processing. Today I wanted to test how much faster the WiredTiger storage engine is than MMAPv1 in MongoDB. For inserting documents with a single thread, the result was as expected: 20% faster. Because WiredTiger locks at a finer granularity (document level) than MMAPv1 (collection level) and supports multi-core CPUs, I also tested the insert speed using multiprocessing. The result surprised me: 14% slower than a single thread.

Let’s show the result first:

Inserting speed: [figure not preserved]

Database size: [figure not preserved]

Conclusion: use WiredTiger; it is faster and uses much less disk space.
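The database-size figure is not preserved here. As a rough sketch of one way to collect comparable numbers, the snippet below reads the size fields from MongoDB's `dbstats` command through pymongo. The helper names `sizes_in_mb` and `database_sizes`, the default database name `MM` (taken from the script in this post), and the connection URI are assumptions for illustration; the post does not say how the sizes were actually measured.

```python
def sizes_in_mb(dbstats):
    """Pick the size fields out of a dbstats reply (a dict) and convert bytes to MB."""
    mb = 1024.0 * 1024.0
    wanted = ("dataSize", "storageSize", "indexSize")
    return {key: dbstats[key] / mb for key in wanted if key in dbstats}


def database_sizes(db_name="MM", uri="mongodb://localhost:27017"):
    """Hypothetical helper: run dbstats against a live mongod and summarize it."""
    # Imported lazily so sizes_in_mb() works even where pymongo is not installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        return sizes_in_mb(client[db_name].command("dbstats"))
    finally:
        client.close()

# Usage (needs a running mongod): print(database_sizes("MM"))
```

Running it once under each storage engine, after the same insert workload, would give two dictionaries that can be compared directly.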

Here is the Python script used for the insert test:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import sys
    import time

    from pymongo import MongoClient
    # Note: multiprocessing.dummy.Pool has the Pool API but is backed by threads.
    from multiprocessing.dummy import Pool

    # Connect to the default MongoClient
    client = MongoClient()
    db = client['MM']
    IDs = range(1000000)


    def writeSample(i):
        """Insert one sample document into the current global collection."""
        item = {"name": "Huidong Tian",
                "title": "PhD",
                "id": i,
                "des": "I am a data scientist focusing on Python and R",
                "other": "As for why the book was created, a theory which has gained considerable interest, although still controversial is Persian imperial authorisation. "}
        collection.insert_one(item)
        msg = "\rwrite " + str(i) + " documents!"
        sys.stdout.write(msg)
        sys.stdout.flush()


    # Single-threaded insert
    collection = db['single']
    t1 = time.time()
    for i in IDs:
        writeSample(i)
    print("\nImporting used " + str(int(time.time() - t1)) + " seconds in total!\n")

    # Insert using a pool of 8 workers
    collection = db['multiple']
    t1 = time.time()
    pool = Pool(8)
    pool.map(writeSample, IDs)
    pool.close()
    pool.join()
    print("\nImporting used " + str(int(time.time() - t1)) + " seconds in total!\n")
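One detail of the script is worth checking when interpreting the result: `multiprocessing.dummy.Pool` exposes the same interface as `multiprocessing.Pool`, but its workers are threads inside the parent process rather than separate processes. The sketch below, which uses only the standard library, makes the difference visible by collecting the process IDs seen by the workers. The helper names `worker_pid` and `distinct_pids` are made up for this illustration, and the snippet says nothing about MongoDB itself.

```python
import os
from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool


def worker_pid(_):
    """Return the ID of the process that executes this task."""
    return os.getpid()


def distinct_pids(pool_cls, workers=4, tasks=32):
    """Run `tasks` trivial jobs on a pool and return the set of process IDs used."""
    pool = pool_cls(workers)
    try:
        return set(pool.map(worker_pid, range(tasks)))
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print("parent pid:        ", os.getpid())
    print("thread pool pids:  ", distinct_pids(ThreadPool))
    print("process pool pids: ", distinct_pids(ProcessPool))
```

With the thread-backed pool, every task reports the parent's own process ID, so all workers share one interpreter and its global interpreter lock. Whether that accounts for the slowdown measured here is not something the post investigates.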

By default, the storage engine of MongoDB 3.4 is WiredTiger. To switch to MMAPv1, follow these steps:

Modify the configuration file /etc/mongod.conf to specify a new data location and the mmapv1 engine:

    storage:
      dbPath: /home/tian/3T/db_mm
      engine: mmapv1

Note: there must be a space after “:”; otherwise, mongod will not start.

The dbPath must be accessible to the mongodb user. To ensure that, change its owner:group and access permissions with these commands:

    sudo chown mongodb:mongodb /home/tian/3T/db_mm
    sudo chmod 755 /home/tian/3T/db_mm

Stop the current mongod service.

Using sudo service mongod stop may not always work. If so, simply kill the mongod process by its pid (use top|grep mongod to find the pid of mongod):

    sudo kill xxxx

Start the mongod service:

    sudo service mongod start

Now type mongo in a terminal; we should be able to connect to the mongod server.
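To confirm that the restarted server really picked up the new engine, one option is to read the `storageEngine.name` field of the `serverStatus` command. The following is a minimal sketch under that assumption; the helper names `engine_name` and `current_engine` and the connection URI are illustrative and not part of the original setup.

```python
def engine_name(server_status):
    """Extract the storage engine name from a serverStatus reply (a dict)."""
    return server_status.get("storageEngine", {}).get("name", "unknown")


def current_engine(uri="mongodb://localhost:27017"):
    """Hypothetical helper: ask a live mongod which storage engine it is running."""
    # Imported lazily so engine_name() works even where pymongo is not installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        return engine_name(client.admin.command("serverStatus"))
    finally:
        client.close()

# Usage (needs a running mongod): print(current_engine())
```

After the configuration change described above, the call would be expected to report mmapv1 rather than wiredTiger.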
