## Counting lines

    with open(file_name, 'rb') as f:
        count = len(f.readlines())

    import time

    start = time.time()
    count = 0
    with open(file_name, 'rb') as f:
        while True:
            # readlines(hint) stops once the lines read so far exceed about `hint` bytes
            lines = f.readlines(1024 * 8192)
            if not lines:
                break
            count += len(lines)
    endtime = time.time()
    print(count, endtime - start)

    start = time.time()
    count = 0
    with open(file_name, 'rb') as f:
        while True:
            # Read one line at a time; an empty bytes object means end of file
            line = f.readline()
            if not line:
                break
            count += 1
    endtime = time.time()
    print(count, endtime - start)

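Besides readline() and readlines(), Python's trio of file-reading methods includes read(), so let's try that next. Called without an argument, read() also pulls in the whole file, so to keep memory from blowing up I set a chunk size here as well.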

    start = time.time()
    count = 0
    with open(file_name, "rb") as reader:
        while True:
            # Read a fixed-size chunk and count the newline bytes in it
            data = reader.read(1024 * 8192)
            if not data:
                break
            count += data.count(b'\n')
    endtime = time.time()
    print(endtime - start)
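One thing to watch when counting `b'\n'`: a file whose last line has no trailing newline comes out one short compared with the readline()-based counts. Below is a minimal sketch of a chunked counter that compensates for this. The function name `count_lines_chunked` and the `chunk_size` parameter are made up for illustration and are not part of the snippets above.

```python
def count_lines_chunked(path, chunk_size=1024 * 8192):
    """Count lines by counting newline bytes in fixed-size chunks."""
    count = 0
    last_byte = b''
    with open(path, 'rb') as reader:
        while True:
            data = reader.read(chunk_size)
            if not data:
                break
            count += data.count(b'\n')
            # Remember the final byte seen so far (slicing keeps it as bytes)
            last_byte = data[-1:]
    # A non-empty file that does not end with a newline has one extra line
    if last_byte and last_byte != b'\n':
        count += 1
    return count
```

With this adjustment the result should agree with the readline() loop regardless of whether the file ends with a newline.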

### Iterators

    start = time.time()
    count = 0
    with open(file_name, 'rb') as f:
        # A file object is itself an iterator over its lines
        for line in f:
            count += 1
    endtime = time.time()
    print(endtime - start)

    start = time.time()
    count = -1  # stays -1 for an empty file, so the final count becomes 0
    for count, line in enumerate(open(file_name, 'rb')):
        pass
    count += 1  # enumerate() is zero-based
    endtime = time.time()
    print(endtime - start)

## Multiprocess

    import time
    import multiprocessing as mp
    from itertools import takewhile, repeat

    def count_lines(file_name):
        count = 0
        with open(file_name, 'rb') as f:
            # Read raw 1 MiB chunks until read() returns an empty bytes object
            bufgen = takewhile(lambda x: x, (f.raw.read(1024 * 1024) for _ in repeat(None)))
            count += sum(buf.count(b'\n') for buf in bufgen if buf)
        return count

    if __name__ == '__main__':
        # file_names: the list of paths to count, defined elsewhere
        start = time.time()
        pool = mp.Pool(processes=4)
        asyncResult = pool.map_async(count_lines, file_names)
        count = sum(asyncResult.get())
        pool.close()
        endtime = time.time()
        print(count, endtime - start)

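P.S. Only after writing this far did it occur to me that this job isn't heavy on CPU computation, so multithreading should be enough; there was no need to go all the way to multiprocessing.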
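Following up on that thought, here is a rough sketch of the same per-file counting done with a thread pool instead of a process pool. It assumes the work is dominated by reading from disk; whether it actually beats the Pool version would need measuring. The names `count_newlines`, `count_newlines_threaded`, and `workers` are hypothetical and chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def count_newlines(path, chunk_size=1024 * 1024):
    """Count newline bytes in one file, reading it chunk by chunk."""
    count = 0
    with open(path, 'rb') as f:
        while True:
            buf = f.read(chunk_size)
            if not buf:
                break
            count += buf.count(b'\n')
    return count


def count_newlines_threaded(paths, workers=4):
    """Sum the newline counts of several files using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map yields results in the same order as `paths`
        return sum(executor.map(count_newlines, paths))
```

Usage would look like `count_newlines_threaded(file_names)`, with `file_names` being the same list of paths passed to the Pool version. Threads also avoid the need for the `if __name__ == '__main__':` guard that multiprocessing requires on platforms that spawn new interpreters.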

## References

1. vmele (2017-12-06). optimization - Optimize file and number line count in Pytho. Retrieved from Stack Overflow (2020-06-18).

## Changelog

• 2020-08-10 Published
• 2020-06-18 Draft completed