Describe python/import data to postgresql from mysql here.

{{{#!highlight python

#!/usr/bin/env python
# coding=utf-8

import MySQLdb, psycopg2
import sys

psycopg2.paramstyle='qmark'  # assigning psycopg2.paramstyle has no effect; use %s for every placeholder

pconn = psycopg2.connect(host='172.16.147.133', user='postgres', password='l', database='address')
pc = pconn.cursor()

def insert(row, table):
    lens = len(row)
    str = "insert into %s values ...

more ...
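Since the excerpt above is cut off, here is a minimal sketch of how such an insert statement could be built with `%s` placeholders only, which is the style the note says psycopg2 accepts. The names `build_insert` and `copy_rows` and the table-name check are assumptions for illustration, not the original script's code.

```python
import re

# Hypothetical helpers, not taken from the original script.
# Table names cannot be passed as query parameters, so only plain
# identifiers are accepted before being formatted into the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_insert(table, row):
    """Return an INSERT statement with one %s placeholder per value in row."""
    if not _IDENTIFIER.match(table):
        raise ValueError("unsafe table name: %r" % (table,))
    placeholders = ", ".join(["%s"] * len(row))
    return "INSERT INTO %s VALUES (%s)" % (table, placeholders)


def copy_rows(rows, table, cursor):
    """Insert each row through a DB-API cursor (for example a psycopg2 cursor)."""
    for row in rows:
        # The values travel separately from the SQL text, so the driver quotes them.
        cursor.execute(build_insert(table, row), tuple(row))
```

With a MySQLdb cursor as the source, `copy_rows(mysql_cursor.fetchall(), 'some_table', pc)` followed by `pconn.commit()` would be one way to use it; `some_table` is a placeholder name.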

python webkit scrapy

On Ubuntu, install python-webkit and jswebkit with apt-get:
{{{
apt-get install python-webkit jswebkit
}}}
On Debian:
{{{
apt-get install python-jswebkit python-webkit
}}}
Add the following to Scrapy's settings.py:
{{{#!highlight python

# which spider should use WEBKIT
WEBKIT_DOWNLOADER = ['jxydt']
DOWNLOADER_MIDDLEWARES = {
    'jx.dowloader.WebkitDownloader': 543,
}

import os
os.environ["DISPLAY"] = ":0"
}}}
dowloader.py
{{{#!highlight python

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from scrapy.http import Request ...

more ...

Describe python/virtualenv pip tips here.

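== virtualenv ==
Is this the introduction? Anyone who has used Python has surely suffered from APIs that keep changing between versions of Python and of its packages. Take Django, which I have been tinkering with: the company server runs Django 1.3 and my colleagues develop against 1.3, but because I am new here, a single pip install django gave me 1.4.2. A Django project you write yourself works fine, and local testing works fine too. The pain starts when you need to exchange code with other people, because your 1.4.2 cannot run their 1.3 code... Of course, you could uninstall your local Django and switch to 1.3, but then you are stuck again as soon as you want to test your own project. virtualenv was created to solve exactly this problem.

Main text

To solve the problem above, we need to install virtualenv.
{{{
sudo pip install virtualenv
}}}
Once it is installed, you can create a virtual environment in any directory (I prefer to call it "Inception"). I usually create a $project_name-env next to the project ...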

more ...
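As an illustration of the "environment next to the project" habit described above, here is a small sketch. It uses the standard-library `venv` module from Python 3 instead of the virtualenv package that the note installs, and the helper names `env_dir_for` and `create_env` are assumptions.

```python
import os
import venv


def env_dir_for(project_dir):
    """Return the sibling directory '<project_name>-env' for a project directory."""
    project_dir = os.path.abspath(project_dir)
    parent, name = os.path.split(project_dir)
    return os.path.join(parent, name + "-env")


def create_env(project_dir):
    """Create an isolated environment next to the project and return its path."""
    target = env_dir_for(project_dir)
    # with_pip=False skips bootstrapping pip, which keeps creation fast.
    venv.create(target, with_pip=False)
    return target
```

For a project in `mysite`, `create_env('mysite')` would create `mysite-env` beside it; each project can then pin its own Django version inside its own environment.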

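== python/Python permutation and combination generators with yield ==

The numeric argument fixes the length of each generated permutation or combination string.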

f=perm('abcdefghijklmnopqrstuvwxyz01234567890', 4)

{{{#!highlight python

# generate all permutations

def perm(items, n=None):
    if n is None:
        n = len(items)
    for i in range(len(items)):
        v = items[i:i+1]
        if n == 1:
            yield v
        else:
            rest = items[:i] + items[i+1:]
            for p in perm(rest, n-1 ...

more ...
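The generator above is truncated, so the following is a sketch of one way it could be completed. The final `yield v + p` line is an assumption about the missing part; the cross-check against `itertools.permutations` is added only to show that the two produce the same sequence.

```python
from itertools import permutations


def perm(items, n=None):
    """Yield every ordered arrangement of n elements taken from items."""
    if n is None:
        n = len(items)
    for i in range(len(items)):
        v = items[i:i+1]              # one-element slice, same type as items
        if n == 1:
            yield v
        else:
            rest = items[:i] + items[i+1:]
            for p in perm(rest, n - 1):
                yield v + p           # assumed completion of the truncated line


def perm_with_itertools(text, n):
    """Same sequence for a string input, built on the standard library."""
    return ["".join(t) for t in permutations(text, n)]
```

Because `perm` is a generator, `f = perm('abc', 2)` computes nothing until it is iterated, so long alphabets can be consumed one value at a time with `next(f)` or a `for` loop.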

Describe python/PIL decoder jpeg not available here.

{{{#!highlight bash
apt-get install libjpeg-dev
apt-get install libfreetype6-dev
}}}

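On Ubuntu 11.10, libjpeg.so, libz.so and libfreetype.so are all located under /usr/lib/x86_64-linux-gnu. Create symlinks to these three libraries in /usr/lib, then run ldconfig.

Search /usr/lib and /usr/local for PIL-related files and directories and delete all of them.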

Delete the extracted PIL source directory Imaging-1.1.7, extract the archive again and edit setup.py, or install with pip instead.
{{{#!highlight bash
rm ...

more ...

Describe python/scrapy error TypeError: __init__() got an unexpected keyword argument '_job' here.

{{{
2012-12-28 05:57:33+0800 [Launcher,29590/stderr] main()
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/runner.py", line 36, in main
    execute()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 131, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7 ...

more ...
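The rest of the traceback is cut off, so the cause in this particular case is not shown. One common cause of this message is that scrapyd starts the spider with an extra `_job` argument and the spider's `__init__` has a fixed signature. The sketch below reproduces that situation with a stand-in base class (`BaseSpider` here is hypothetical and is not imported from Scrapy) and shows a constructor that accepts extra keyword arguments.

```python
class BaseSpider(object):
    """Hypothetical stand-in for a spider base class; it stores keyword arguments."""

    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        self.__dict__.update(kwargs)


class StrictSpider(BaseSpider):
    """Fixed signature: any unexpected keyword such as _job raises TypeError."""
    name = "strict"

    def __init__(self, category=None):
        super(StrictSpider, self).__init__()
        self.category = category


class TolerantSpider(BaseSpider):
    """Accepts and forwards extra arguments, so _job reaches the base class."""
    name = "tolerant"

    def __init__(self, category=None, *args, **kwargs):
        super(TolerantSpider, self).__init__(*args, **kwargs)
        self.category = category
```

Calling `StrictSpider(category='x', _job='abc')` fails with the TypeError from the title, while `TolerantSpider(category='x', _job='abc')` keeps the value as an attribute.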

Describe python/scrapy and PIL installation here.

{{{
aptitude install python-setuptools python2.7-dev
easy_install pip
aptitude install libxslt-dev
aptitude install libxml2-dev

pip install scrapy

pip install sqlalchemy

pip install PIL

}}}
On Ubuntu 11.10, libjpeg.so, libz.so and libfreetype.so are all located under /usr/lib/x86_64-linux-gnu. Create symlinks to these three libraries in /usr/lib, then run ldconfig.

more ...

Scrapy imagePipelines: overriding the image save path

You only need to override the image_key method:
{{{#!highlight python
from scrapy.contrib.pipeline.images import ImagesPipeline
import hashlib
from datetime import datetime

class MyImagesPipeline(ImagesPipeline):
    def image_key(self, url):
        image_guid = hashlib.sha1(url).hexdigest()
        #return 'full/%s.jpg' % (image_guid)
        path = datetime.now().strftime("%Y/%m/%d")
        return '%s/%s.jpg' % (path, image_guid)

}}}

more ...
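To see what keys the override above produces without running a crawl, the path logic can be sketched as a standalone function. The name `dated_image_key` and the `now` parameter are assumptions added so the result can be checked with a fixed date; the URL is encoded to bytes because `hashlib.sha1` requires bytes on Python 3.

```python
import hashlib
from datetime import datetime


def dated_image_key(url, now=None):
    """Return 'YYYY/MM/DD/<sha1 of url>.jpg', mirroring the image_key override."""
    if now is None:
        now = datetime.now()
    image_guid = hashlib.sha1(url.encode("utf-8")).hexdigest()
    path = now.strftime("%Y/%m/%d")
    return "%s/%s.jpg" % (path, image_guid)
```

One consequence of this layout is that the key depends on the download date, so the same URL fetched on two different days is stored under two different paths.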

moinmoin ubuntu nginx

Edit /etc/rc.local with vi and add the following before exit 0:
{{{
uwsgi -x /www/moin/uwsgi.xml >/dev/null 2>&1 &
}}}

cat /etc/nginx/sites-enabled/wiki
{{{
server {
    listen 4081;
    access_log /dev/null;
    location / {
        include uwsgi_params;
        uwsgi_pass unix:///www/moin/moin.sock;
        uwsgi_modifier1 30;
    }
}

}}}

cat moin/uwsgi.xml
{{{
www-data www-data python /www/moin/moin.sock ...

more ...

Converting doc files to html, rtf, pdf... with abiword from Python

{{{
sudo apt-get install abiword -y
abiword -t output.html resume.doc # convert a Word document to HTML
}}}

{{{#!highlight python
import subprocess
import os
import uuid

def document_to_html(file_path):  
    tmp = "/tmp"  
    guid = str(uuid.uuid1())  
    # convert the file, using a temporary file w/ a random name  
    command = "abiword -t %(tmp)s/%(guid)s.html %(file_path)s ...
more ...
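The function above is truncated, so here is a sketch of how the conversion could be finished under a few assumptions: the command is passed to `subprocess` as an argument list instead of a formatted string (so paths with spaces need no quoting), and the converted file is read back and returned. The helper `build_abiword_command` is hypothetical.

```python
import os
import subprocess
import uuid


def build_abiword_command(file_path, tmp="/tmp", guid=None):
    """Return (argv, output_path) for 'abiword -t <output>.html <input>'."""
    if guid is None:
        guid = str(uuid.uuid1())      # random name for the temporary output file
    out_path = os.path.join(tmp, guid + ".html")
    return ["abiword", "-t", out_path, file_path], out_path


def document_to_html(file_path, tmp="/tmp"):
    """Convert a document with abiword and return the generated HTML text."""
    command, out_path = build_abiword_command(file_path, tmp)
    subprocess.check_call(command)    # raises CalledProcessError if abiword fails
    try:
        with open(out_path) as f:
            return f.read()
    finally:
        if os.path.exists(out_path):
            os.remove(out_path)       # clean up the temporary file
```

`document_to_html('resume.doc')` requires abiword to be installed; `build_abiword_command` can be inspected on its own to check the exact command line before running it.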