Playing back a binary file using Python

Category: Development (Python) | Posted by: 店小二04 | Date: 2018 | Author: 红领巾

I am trying to read a file backwards (from the end to the beginning). The example below does this, but I would like to ask the community: is there a more elegant solution to my problem?

import os, binascii

CHUNK = 10  # read file by blocks (big size)
src_file_path = 'd:\\src\\python\\test\\main.zip'
src_file_size = os.path.getsize(src_file_path)
src_file = open(src_file_path, 'rb')  # open in binary mode
while src_file_size > 0:  # read file from last byte to first :)
    if src_file_size > CHUNK:
        src_file.seek(src_file_size - CHUNK)
        byte_list = src_file.read(CHUNK)
    else:
        src_file.seek(0)
        byte_list = src_file.read(src_file_size)
    s = binascii.hexlify(byte_list)  # convert '\xFB' -> 'FB'
    byte_list = [(chr(s[i]) + chr(s[i+1])) for i in range(0, len(s), 2)]  # split, note below
    print(byte_list[::-1])  # output reverse list
    src_file_size = src_file_size - CHUNK
src_file.close()  # close file

UPD: I would like to know the opinion of experts: what do I need to pay attention to as a newbie in Python? Is there a potential flaw in this code?

Thanks in advance.

I'm using Python 3.3.1. Note: split by bytes from here!

I can see several things that could be improved in the code from the question. Firstly, the while loop is used fairly rarely in Python, because there is almost always a better way to express the same thing using a for loop or built-in functions.

I guess the code is purely for training purposes. Otherwise, I would first ask what the real goal is (because once the problem is known, the better solution may be very different from the first idea).

The goal here is to get the positions for seek. You know the file size, you know the chunk size, and you want to go backwards. Python has a built-in for exactly this purpose, named range, which produces its values lazily. It is mostly used with a single argument; however, range(start, stop, step) is the full form. It can be iterated in a for loop, or you can use its values, say, to build a list (though you often do not need the latter). The positions for seek can be generated like this:

chunk = 10
sz = 235

lst = list(range(sz - chunk, 0, -chunk))
print(lst)

I.e., you start from the sz - chunk position and stop before zero (the stop value itself is never generated), using a negative step to get each next value. Here list() iterates through all the values and builds a list of them. But you can iterate directly through the generated values:

for pos in range(sz - chunk, 0, -chunk):
    print('seek({}) and read({})'.format(pos, chunk))

if pos > 0:
    print('seek({}) and read({})'.format(0, pos))

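The last generated position is always positive, because the stop value 0 itself is never produced. This way, the final if processes the remaining portion at the start of the file, which may be shorter than chunk. (If sz is not greater than chunk, the loop body never runs and pos is never assigned, so the snippet assumes sz > chunk.) Putting the above code together, it prints: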

c:\tmp\_Python\wikicsm\so16443185>py a.py
[225, 215, 205, 195, 185, 175, 165, 155, 145, 135, 125, 115, 105, 95, 85, 75, 65, 55, 45, 35, 25, 15, 5]
seek(225) and read(10)
seek(215) and read(10)
seek(205) and read(10)
seek(195) and read(10)
seek(185) and read(10)
seek(175) and read(10)
seek(165) and read(10)
seek(155) and read(10)
seek(145) and read(10)
seek(135) and read(10)
seek(125) and read(10)
seek(115) and read(10)
seek(105) and read(10)
seek(95) and read(10)
seek(85) and read(10)
seek(75) and read(10)
seek(65) and read(10)
seek(55) and read(10)
seek(45) and read(10)
seek(35) and read(10)
seek(25) and read(10)
seek(15) and read(10)
seek(5) and read(10)
seek(0) and read(5)
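As a side note, the same positions can be computed by a small helper that also copes with a size that is not greater than the chunk. This is only a sketch under my own assumptions: the name seek_plan and the idea of returning (position, length) pairs are hypothetical and are not part of the code above.

```python
def seek_plan(sz, chunk):
    """Return (position, length) pairs that cover sz bytes from the end to the start."""
    # Full-sized chunks, walking backwards; the stop value 0 is never produced.
    plan = [(pos, chunk) for pos in range(sz - chunk, 0, -chunk)]
    # Bytes still uncovered at the start: the last position, or everything
    # when the range was empty (sz <= chunk).
    remaining = plan[-1][0] if plan else sz
    if remaining > 0:
        plan.append((0, remaining))
    return plan


print(seek_plan(235, 10)[-2:])  # the last two steps for the example above
print(seek_plan(5, 10))         # a size smaller than the chunk
```

For sz = 235 and chunk = 10 the plan ends with (5, 10) and (0, 5), matching the printed trace, and for sz = 5 it is the single pair (0, 5).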

I personally would replace the print calls with a call to a function that takes the file object, pos, and the chunk size. Here is a faked body that produces the same output:

#!python3
import os

def processChunk(f, pos, chunk_size):
    print('faked f: seek({}) and read({})'.format(pos, chunk_size))


fname = 'a.txt'
sz = os.path.getsize(fname)     # not checking existence for simplicity
chunk = 16

with open(fname, 'rb') as f:

    for pos in range(sz - chunk, 0, -chunk):
        processChunk(f, pos, chunk)

    if pos > 0:
        processChunk(f, 0, pos)

The with construct is another one worth learning. (Warning: it is nothing like Pascal's with.) It closes the file object automatically when the block ends. Notice that the code under the with is more readable and will not need to change in the future. The processChunk function will be developed further:

import binascii  # needed for hexlify

def processChunk(f, pos, chunk_size):
    f.seek(pos)
    s = binascii.hexlify(f.read(chunk_size))
    print(s)

or you can change it slightly so that its result is a reversed hexdump (the full code tested on my computer):

#!python3

import binascii
import os

def processChunk(f, pos, chunk_size):
    f.seek(pos)
    b = f.read(chunk_size)
    b1 = b[:8]                  # first 8 bytes
    b2 = b[8:]                  # the rest
    s1 = ' '.join('{:02x}'.format(x) for x in b1)
    s2 = ' '.join('{:02x}'.format(x) for x in b2)
    print('{:08x}:'.format(pos), s1, '|', s2)


fname = 'a.txt'
sz = os.path.getsize(fname)     # not checking existence for simplicity
chunk = 16

with open(fname, 'rb') as f:

    for pos in range(sz - chunk, 0, -chunk):
        processChunk(f, pos, chunk)

    if pos > 0:
        processChunk(f, 0, pos)

When a.txt is a copy of the script above, it produces:

c:\tmp\_Python\wikicsm\so16443185>py d.py
00000274: 75 6e 6b 28 66 2c 20 30 | 2c 20 70 6f 73 29 0d 0a
00000264: 20 20 20 20 20 20 20 70 | 72 6f 63 65 73 73 43 68
00000254: 20 20 69 66 20 70 6f 73 | 20 3e 20 30 3a 0d 0a 20
00000244: 6f 73 2c 20 63 68 75 6e | 6b 29 0d 0a 0d 0a 20 20
00000234: 72 6f 63 65 73 73 43 68 | 75 6e 6b 28 66 2c 20 70
00000224: 75 6e 6b 29 3a 0d 0a 20 | 20 20 20 20 20 20 20 70
00000214: 20 2d 20 63 68 75 6e 6b | 2c 20 30 2c 20 2d 63 68
00000204: 20 70 6f 73 20 69 6e 20 | 72 61 6e 67 65 28 73 7a
000001f4: 61 73 20 66 3a 0d 0a 0d | 0a 20 20 20 20 66 6f 72
000001e4: 65 6e 28 66 6e 61 6d 65 | 2c 20 27 72 62 27 29 20
000001d4: 20 3d 20 31 36 0d 0a 0d | 0a 77 69 74 68 20 6f 70
000001c4: 69 6d 70 6c 69 63 69 74 | 79 0d 0a 63 68 75 6e 6b
000001b4: 20 65 78 69 73 74 65 6e | 63 65 20 66 6f 72 20 73
000001a4: 20 20 23 20 6e 6f 74 20 | 63 68 65 63 6b 69 6e 67
00000194: 65 74 73 69 7a 65 28 66 | 6e 61 6d 65 29 20 20 20
00000184: 0d 0a 73 7a 20 3d 20 6f | 73 2e 70 61 74 68 2e 67
00000174: 0a 66 6e 61 6d 65 20 3d | 20 27 61 2e 74 78 74 27
00000164: 31 2c 20 27 7c 27 2c 20 | 73 32 29 0d 0a 0d 0a 0d
00000154: 27 2e 66 6f 72 6d 61 74 | 28 70 6f 73 29 2c 20 73
00000144: 20 20 70 72 69 6e 74 28 | 27 7b 3a 30 38 78 7d 3a
00000134: 66 6f 72 20 78 20 69 6e | 20 62 32 29 0d 0a 20 20
00000124: 30 32 78 7d 27 2e 66 6f | 72 6d 61 74 28 78 29 20
00000114: 32 20 3d 20 27 20 27 2e | 6a 6f 69 6e 28 27 7b 3a
00000104: 20 78 20 69 6e 20 62 31 | 29 0d 0a 20 20 20 20 73
000000f4: 7d 27 2e 66 6f 72 6d 61 | 74 28 78 29 20 66 6f 72
000000e4: 20 27 20 27 2e 6a 6f 69 | 6e 28 27 7b 3a 30 32 78
000000d4: 65 20 72 65 73 74 0d 0a | 20 20 20 20 73 31 20 3d
000000c4: 20 20 20 20 20 20 20 20 | 20 20 20 20 23 20 74 68
000000b4: 62 32 20 3d 20 62 5b 38 | 3a 5d 20 20 20 20 20 20
000000a4: 73 74 20 38 20 62 79 74 | 65 73 0d 0a 20 20 20 20
00000094: 20 20 20 20 20 20 20 20 | 20 20 20 23 20 66 69 72
00000084: 31 20 3d 20 62 5b 3a 38 | 5d 20 20 20 20 20 20 20
00000074: 75 6e 6b 5f 73 69 7a 65 | 29 0d 0a 20 20 20 20 62
00000064: 20 20 20 62 20 3d 20 66 | 2e 72 65 61 64 28 63 68
00000054: 20 20 66 2e 73 65 65 6b | 28 70 6f 73 29 0d 0a 20
00000044: 63 68 75 6e 6b 5f 73 69 | 7a 65 29 3a 0d 0a 20 20
00000034: 73 73 43 68 75 6e 6b 28 | 66 2c 20 70 6f 73 2c 20
00000024: 20 6f 73 0d 0a 0d 0a 64 | 65 66 20 70 72 6f 63 65
00000014: 62 69 6e 61 73 63 69 69 | 0d 0a 69 6d 70 6f 72 74
00000004: 74 68 6f 6e 33 0d 0a 0d | 0a 69 6d 70 6f 72 74 20
00000000: 23 21 70 79 |
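If the chunks themselves are needed rather than a printed dump, the seek-and-read logic could be wrapped in a generator. The following is a rough sketch, not the code from the answer: the name read_chunks_backwards is made up, chunks are aligned to the end of the file (so the short one comes last, as in the dump above), and bytes.hex() assumes Python 3.5 or newer. The demo writes its own temporary file so that it can run anywhere.

```python
import os
import tempfile


def read_chunks_backwards(f, chunk_size):
    """Yield the byte chunks of an open binary file, last chunk first."""
    f.seek(0, os.SEEK_END)
    size = f.tell()
    # 'end' walks backwards over the chunk end offsets; 0 is never produced.
    for end in range(size, 0, -chunk_size):
        start = max(0, end - chunk_size)
        f.seek(start)
        yield f.read(end - start)


# Demo on a throwaway 40-byte file (hypothetical sample data).
data = bytes(range(40))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.bin')
    with open(path, 'wb') as f:
        f.write(data)
    with open(path, 'rb') as f:
        chunks = list(read_chunks_backwards(f, 16))

for c in chunks:
    print(c.hex())
```

Because the generator derives the size from the file object, a file shorter than one chunk simply yields a single short chunk, and an empty file yields nothing.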

For src_file_path = 'd:\\src\\python\\test\\main.zip', you can also use forward slashes on Windows, as in src_file_path = 'd:/src/python/test/main.zip'. Or you can use a raw string, as in src_file_path = r'd:\src\python\test\main.zip'. Raw strings are useful when you want to avoid doubling backslashes, which is often the case when writing regular expressions.
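A quick way to convince yourself that these spellings are equivalent is sketched below. The use of pathlib.PureWindowsPath is my own addition (it needs Python 3.4 or newer) and is there only to compare the paths by Windows rules on any platform.

```python
from pathlib import PureWindowsPath

escaped = 'd:\\src\\python\\test\\main.zip'   # doubled backslashes
raw = r'd:\src\python\test\main.zip'          # raw string, no doubling
forward = 'd:/src/python/test/main.zip'       # forward slashes

# The escaped and the raw literal are the very same string.
same_string = (escaped == raw)
print(same_string)  # True

# The forward-slash form is a different string, but the same Windows path.
same_path = (PureWindowsPath(forward) == PureWindowsPath(raw))
print(same_path)  # True

# Why the doubling matters: a single backslash before 't' is a tab character.
with_tab = '\test'
print(len(with_tab), len(r'\test'))  # 4 5
```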
