Concatenating PDF with Python
March 05, 2009 at 02:33 PM | categories: python, code | View Comments
I need to concatenate a set of PDFs, so I will take you through my standard-issue Python development approach for doing something I've never done before in Python.
My first instinct was to google for pyPDF. Success! So, forgo reading any docs and just give the old easy_install a try.
$ sudo easy_install pypdf
Another success! Ok, a couple of help() calls later and I am ready to go. The end result is surprisingly small and seems to run fast enough even for PDFs with 50+ pages.
from pyPdf import PdfFileWriter, PdfFileReader

def append_pdf(input, output):
    # Copy every page of the input reader into the output writer, in order.
    for page_num in range(input.numPages):
        output.addPage(input.getPage(page_num))

output = PdfFileWriter()
# Append the same sample twice, then write the combined document.
append_pdf(PdfFileReader(open("sample.pdf", "rb")), output)
append_pdf(PdfFileReader(open("sample.pdf", "rb")), output)
output.write(open("combined.pdf", "wb"))
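For anyone trying this on Python 3: pyPdf itself is no longer maintained, and the sketch below assumes its successor package `pypdf` (with `PdfReader`, `PdfWriter`, `reader.pages` and `writer.add_page`) is installed. The names `append_pages` and `concatenate` are my own, not part of either library, so treat this as a rough port of the idea above rather than the definitive way to do it.

```python
def append_pages(reader, writer):
    """Copy every page from reader into writer, preserving page order."""
    for page in reader.pages:
        writer.add_page(page)


def concatenate(paths, out_path):
    """Concatenate the PDFs at `paths`, in order, into `out_path`."""
    # Imported here so append_pages stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in paths:
        append_pages(PdfReader(path), writer)
    # The with-block makes sure the output file is flushed and closed.
    with open(out_path, "wb") as fh:
        writer.write(fh)
```

Calling `concatenate(["sample.pdf", "sample.pdf"], "combined.pdf")` should reproduce what the original script does, assuming `sample.pdf` exists in the working directory.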
Pylons and long-lived AJAX requests
February 04, 2009 at 07:11 PM | categories: python, code, pylons | View Comments
[/sourcecode]
Now the controller code itself.
[sourcecode language="python"]
class ServerController(BaseController):
    def index(self):
        response.headers['Content-type'] = 'multipart/x-mixed-replace;boundary=test'
        return data_stream()

def data_stream(stream=True):
    yield datetime_string()
    while stream:
        time.sleep(5)
        yield datetime_string()

@memorize(duration=15)
def datetime_string():
    content = '--test\nContent-type: application/xml\n\n'
    content += '\n'
    content += '
[/sourcecode]
Code: Buffer CGI file uploads in Windows
August 19, 2008 at 04:41 PM | categories: python, code | View Comments
Note to self: when handling CGI file uploads on a Windows machine, you need the following boilerplate to properly handle binary files.
[sourcecode language='python']
import os

try:
    # Windows needs stdio set for binary mode.
    import msvcrt
    msvcrt.setmode(0, os.O_BINARY)  # stdin = 0
    msvcrt.setmode(1, os.O_BINARY)  # stdout = 1
except ImportError:
    pass
[/sourcecode]
Also, if you're handling very large files and don't want to eat up all your memory saving them using the copyfileobj method, you can use a generator to read and write the file in buffered chunks.
[sourcecode language='python']
def buffer(f, sz=1024):
    while True:
        chunk = f.read(sz)
        if not chunk:
            break
        yield chunk

# then use it like this ...
for chunk in buffer(fp.file):
    disk_file.write(chunk)  # disk_file: an output file you opened with 'wb'
[/sourcecode]
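To see the chunked approach end to end, here is a small self-contained sketch written for Python 3. It uses in-memory `io.BytesIO` objects as stand-ins for the uploaded file and the destination, and the helper names `read_in_chunks` and `copy_in_chunks` are made up for illustration (I avoided `buffer` because it shadows a Python 2 builtin).

```python
import io


def read_in_chunks(fileobj, size=1024):
    """Yield successive blocks of at most `size` bytes until EOF."""
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            break
        yield chunk


def copy_in_chunks(src, dst, size=1024):
    """Copy src to dst one chunk at a time; return the bytes written."""
    total = 0
    for chunk in read_in_chunks(src, size):
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-in for an uploaded file: 2500 bytes held in memory.
upload = io.BytesIO(b"x" * 2500)
saved = io.BytesIO()
copied = copy_in_chunks(upload, saved, size=1024)
```

Only one chunk is held by the loop at any time, so peak memory is bounded by `size` rather than by the size of the upload.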
Code: Saving in memory file to disk
July 16, 2008 at 06:27 PM | categories: python, code | View Comments
Okay, I discovered this today while looking to increase the speed at which uploaded documents are saved to disk. I can't explain the inner workings of why it is faster; all I know is that with the exact same form upload test run 100 times with a 25MB file over a 100Mbit/s network, this method was on average a whole 2.3 seconds faster than the traditional methods of write, writelines, etc. How does this extend to real-world production usage over external networks? No idea, though I plan to find out. You all will be the first to know as soon as I find some guinea pig site that does enough file uploads to implement this on.
[sourcecode language='python']
# Minus some boilerplate for validity and variable setup.
import os
import shutil

memory_file = request.POST['upload']
# Open in binary mode so the upload is written byte-for-byte.
disk_file = open(os.path.join(save_folder, save_name), 'wb')
shutil.copyfileobj(memory_file.file, disk_file)
disk_file.close()
[/sourcecode]
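Here is the same idea as a runnable Python 3 sketch with the boilerplate filled in. The `save_upload` helper, the temporary folder and the `io.BytesIO` stand-in for `request.POST['upload'].file` are all assumptions of mine for the sake of a self-contained example; in a real Pylons controller you would pass the upload's `.file` attribute instead.

```python
import io
import os
import shutil
import tempfile


def save_upload(fileobj, folder, name):
    """Write a file-like object to folder/name and return the full path."""
    path = os.path.join(folder, name)
    # 'wb' keeps the bytes intact on Windows; the with-block closes the
    # file even if the copy raises.
    with open(path, "wb") as disk_file:
        shutil.copyfileobj(fileobj, disk_file)
    return path


# Stand-in for request.POST['upload'].file
memory_file = io.BytesIO(b"\x00\x01binary\r\npayload")
save_folder = tempfile.mkdtemp()
saved_path = save_upload(memory_file, save_folder, "upload.bin")
```

`shutil.copyfileobj` also accepts an optional third argument for the buffer length, which is the obvious knob to try if you want to repeat the timing experiment.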