Snowdream

阅读进度

10.31: CLRS 22.1-22.3 Elementary Graph Algorithms
10.25: CLRS 18.2 Operations on B-trees
10.22: CLRS 18.1 Definition of B-trees
10.15: CLRS 16.3 Huffman codes
10.14: CLRS 16.2 Elements of the greedy strategy
10.13: CLRS 16.1 An activity-selection problem

10.11-12: LKD 调度 O(1)调度算法
10.10: 虎书看完Abstract Syntax
10.7~9: LKD 第二章进程管理看完，不过还是没什么感觉，看来代码读的不够多
10.3: 虎书 LR Parsing 看到Error Recovery之前
9.24-25: CLRS 15 Dynamic Programming 看完，习题未做
9.22: CSAPP Chapter1 除浮点部分回顾了一遍
9.19: CSAPP 6.3 The Memory Hierarchy
9.18: CSAPP 6.2 Locality

总进度：
CS: APP
Chapter1(Tour) 泛读一遍
Chapter2(Representing and Manipulating) 除浮点部分已看完
Chapter3(Machine-Level Representation of Programs) 除*部分已看完
Chapter6(The Memory Hierarchy) 正在看，跳过第一节Storage Technologies
Chapter7(Linking) 看过一遍，Symbols and Symbol Tables, Relocation部分还不怎么清楚
Chapter8(Exceptional Control Flow) 看完
Chapter10(Virtual Memory) 看过一点，发现不知道Locality后跳到第6章

CLRS
Part I: Foundation 粗略的看了一遍，主要了解了下Big-Oh Big-Omega Big-Theta的概念，Master Method的应用和简单的Generation Function
Part II: Sorting and Order Statistics 除复杂度证明部分外看了一遍，大多数习题都看过
Part III: Data Structures 翻过一遍，*部分都没看，习题看的不多，红黑书相关的操作还不怎么熟练，后面两章还要再看一下
Part IV: Advanced Design and Analysis Techniques 跳过Amortized Analysis，做了部分习题
Part V: Advanced Data Structures 看了一点B-Tree，二分堆、Fibonacci堆和并查集先跳过了
PartVI: Graph Algorithms 正在看

Modern Compilers Implementation in C
从头看到第四章 Abstract Syntax，略过Burke-Fisher错误恢复

Linux Kernel Development 中文版
刚开始看

posted @ 2007-09-17 16:44 ZelluX 阅读(463) | 评论 (0) | 编辑收藏

Arch 安装手记

1. 安装后reboot，选择Arch Linux Fallback，出现登录提示后键盘无响应，按Num Lock键指示灯也没有变化，此时拔掉键盘再插上即可。

2. Network IP, Gateway, Mask设置都正常，但无法ping到网关，翻了wiki后终于发现是Windows在捣鬼
Windows中关机后会自动将Realtek 8168 8169 8101 8111网卡禁用，直到再次进入Windows后才会恢复
解决方法是在设备管理器->网络适配器->Realtek xxx->高级中把"Wake-On-Lan After Shutdown"设为Enable

3. 几个pacman常用的参数
pacman -Sy 更新本地包的数据库
pacman -S package_name1 package_name2 ... 安装包
pacman -S extra/package_name 指定安装某个版本的包
pacman -Su 更新所有安装的包，通常和Sy参数合并为-Syu
pacman -Ss 搜索
pacman -U *.pkg.tar.gz 安装本地包

4. 好不容易+莫名奇妙地配好了显卡，startx可以跑起来了
然后运行gnome-session，提示Gtk WARNING **: Cannot open display:
找了一堆解决方法，总算有一个可行的：
在~/.xinitrc中增加gnome-session，然后启动startx

posted @ 2007-09-12 16:03 ZelluX 阅读(1164) | 评论 (0) | 编辑收藏

texttable - module for creating simple ASCII tables

http://jefke.free.fr/soft/texttable/
dl: http://jefke.free.fr/soft/texttable/texttable.py
NAME
   texttable - module for creating simple ASCII tables
FILE
   /usr/lib/python2.3/site-packages/texttable.py
DESCRIPTION
   Example:
   table = Texttable()
   table.header(["Name", "Age"])
   table.set_cols_align(["l", "r"])
   table.add_row(["Xavier\nHuon", 32])
   table.add_row(["Baptiste\nClement", 1])
   table.draw()
   Result:
   +----------+-----+
   | Name | Age |
   +==========+=====+
   | Xavier | 32 |
   | Huon | |
   +----------+-----+
   | Baptiste | 1 |
   | Clement | |
   +----------+-----+
CLASSES
   exceptions.Exception
   ArraySizeError
   Texttable
   class ArraySizeError(exceptions.Exception)
   | Exception raised when specified rows don't fit the required size
   |
   | Methods defined here:
   |
   | __init__(self, msg)
   |
   | __str__(self)
   |
   | ----------------------------------------------------------------------
   | Methods inherited from exceptions.Exception:
   |
   | __getitem__(...)
   class Texttable
   | Methods defined here:
   |
   | __init__(self, max_width=80)
   | Constructor
   | - max_width is an integer, specifying the maximum width of the t
able
   | - if set to 0, size is unlimited, therefore cells won't be wrapp
ed
   |
   | add_row(self, array)
   | Add a row in the rows stack
   |
   | Cells can contain newlines.
   |
   | draw(self)
   | Draw the table
   |
   | header(self, array)
   | Specify the header of the table
   |
   | reset(self)
   | Reset the instance:
   | - reset rows and header
   |
   | set_chars(self, array)
   | Set the characters used to draw lines between rows and
   | columns.
   |
   | The array should contain 4 fields:
   |
   | [horizontal, vertical, corner, header]
   |
   | Default is set to:
   |
   | ['-', '|', '+', '=']
   |
   | set_cols_align(self, array)
   | Set the desired columns alignment
   |
   | The elements of the array should be either "l", "c" or "r"
   | - "l": column flushed left
   | - "c": column centered
   | - "r": column flushed right
   |
   | set_cols_width(self, array)
   | Set the desired columns width
   |
   | The elements of the array should be integers, specifying the
   | width of each column. For example:
   |
   | [10, 20, 5]
   |
   | set_deco(self, deco)
   | Set the table decoration. 'deco' can be a combinaison of:
   |
   | Texttable.BORDER: Border around the table
   | Texttable.HEADER: Horizontal line below the header
   | Texttable.HLINES: Horizontal lines between rows
   | Texttable.VLINES: Vertical lines between columns
   |
   | Example:
   |
   | Texttable.BORDER | Texttable.HEADER
   |
   | All of them are enabled by default.
   |
   | --------------------------------------------------------------------
--
   | Data and other attributes defined here:
   |
   | BORDER = 1
   |
   | HEADER = 4
   |
   | HLINES = 8
   |
   | VLINES = 16
DATA
   __all__ = ['Texttable', 'ArraySizeError']
   __author__ = 'Gerome Fournier <jefke(at)free.fr>'
   __credits__ = 'Jeff Kowalczyk:\n - textwrap improved import\n - ..
.
   __license__ = 'GPL'
   __revision__ = '$Id: texttable.py,v 1.3 2003/10/05 13:53:39 jef Exp je..
.
   __version__ = '0.3'
VERSION
   0.3
AUTHOR
   Gerome Fournier <jefke(at)free.fr>
CREDITS
   Jeff Kowalczyk:
   - textwrap improved import
   - comment concerning header output

posted @ 2007-09-10 23:40 ZelluX 阅读(509) | 评论 (0) | 编辑收藏

燕曦版友信息统计脚本

connector.py

import urllib, urllib2, cookielib

class MyConnector:

def __init__(self):

pass

def login(self, url):

cookie = cookielib.CookieJar()

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))

urllib2.install_opener(opener)

str = urllib.urlencode({'id': 'guest', 'passwd': ''})

self.sock = urllib2.urlopen(url, str)

def getHTML(self, url):

self.sock = urllib2.urlopen(url)

return self.sock.read()

yanxiparser.py

from sgmllib import SGMLParser

import re

class YanxiURLParser(SGMLParser):

def reset(self):

self.result = []

SGMLParser.reset(self)

def start_a(self, attrs):

for (k, v) in attrs:

if (k == 'href' and (v.find('bbsanc') >= 0)):

self.result.append(v)

class YanxiHTMLParser:

def parse(self, html):

uid = ufrom = ubirth = ufav = ''

html = html.replace(r' ', ' ')

html = html.replace(r'<br />', '')

pattern = '\xbe\xcd\xca\xc7(.*)\xc0\xb2'

matchObject = re.search(pattern, html)

uid = matchObject.group(1)

uid = uid.strip()

pattern = '\xc0\xb4\xd7\xd4(.*)\xa3(\xac|xa1)'

matchObject = re.search(pattern, html)

ufrom = matchObject.group(1)

ufrom = ufrom.strip()

pattern = '\xcf\xb2\xbb\xb6(.*)\n'

matchObject = re.search(pattern, html)

ufav = matchObject.group(1)

ufav = ufav.strip()

pattern = '\n(.*)\xca\xc7\xce\xd2\xb5\xc4\xc9\xfa\xc8\xd5'

matchObject = re.search(pattern, html)

ubirth = matchObject.group(1)

ubirth = ubirth.strip()

return {"id" : uid, "from" : ufrom, "birth" : ubirth, "fav" : ufav}

runner.py

from connector import MyConnector

from yanxiparser import *

rootURL = 'http://yanxibbs.cn'

loginURL = 'http://yanxibbs.cn/bbslogin.php'

url1 = 'http://yanxibbs.cn/cgi-bin/bbs/bbs0an?path=%2Fgroups%2FGROUP%5F3%2F06SS%2Fbyxx%2Fbjcy'

url2 = 'http://yanxibbs.cn/cgi-bin/bbs/bbs0an?path=%2Fgroups%2FGROUP%5F3%2F06SS%2Fbyxx%2Fbjyr'

conn = MyConnector()

conn.login(loginURL)

def printInfo(url):

html = conn.getHTML(url)

urlParser = YanxiURLParser()

htmlParser = YanxiHTMLParser()

urlParser.feed(html)

for targetURL in urlParser.result:

html = conn.getHTML(rootURL + targetURL)

info = htmlParser.parse(html)

print "%(id)s\t%(from)s\t%(birth)s\t%(fav)s" % info

printInfo(url1)

printInfo(url2)

posted @ 2007-09-10 12:27 ZelluX 阅读(425) | 评论 (2) | 编辑收藏

Lex 使用笔记

1. 写了个测试脚本，逐个测试testcases目录下的各个tiger程序，并统计出错数
#!/usr/bin/python

from commands import *

countOk = countError = 0

for i in range(1,50):
    result = getstatusoutput('./lextest ../testcases/test%s.tig' % i)
    if (result[0] == 0):
        print('Test case %s: OK' % i)
        countOk += 1
    else:
        print('Error on test case %s with errorno %s' % (i, result[0]))
        countError += 1

print("Total cases: %s" % (countOk + countError))
print("Passed cases: %s" % (countOk))
print("Failed cases: %s" % (countError))

2. 状态：
定义的状态名不能与已经定义的变量/宏名冲突。在处理字符串的时候定义了个<STRING>状态，和tokens.h中的STRING冲突了，结果解析的时候被认成了BAD_TOKEN。

posted @ 2007-09-09 22:22 ZelluX 阅读(504) | 评论 (0) | 编辑收藏

[zz]一个很不错的Ubuntu配置帖子

发信人: laye (Addict to Goth >_<), 信区: Unix
标题: 我的 Ubuntu 7.04 Cookbook
发信站: 日月光华 (2007年09月09日09:19:35 星期天), 站内信件

最近装了 Ubuntu 7.04，写了一个简单的 Cookbook，记录了自己遇到的一些问题和解决
方法，希望对大家有帮助。

cook01 中文字体美化

(1) 得到 simsun.ttc tahoma.ttf tahomabd.ttf verdana.ttf verdanab.ttf
verdanai.ttf verdanaz.ttf 几个个字体文件（可以从 Windows 的 Fonts 文件夹中获
取)
(2) 用 nautilus 打开系统的字体文件夹：nautilus fonts:///
(3) 将第 (1) 步中的几个个字体复制进第 (2) 步打开的字体文件夹中
(4) 创建 /etc/fonts/local.conf 文件，内容如 [附录1] (本文最底端)
(5) 重新启动 X: Ctrl + Alt + Backspace
(6) 打开 gnome 字体设置：gnome-font-properties，设置前四个字体为 Tahoma
(7) 打开 firefox 字体设置：将 Serif 设置为 SimSun，Sans-serif 设置为 Tahoma，
Monospace 设置为 SimSun，并且允许网页自由选择字体
(8) 重新启动 X, enjoy~

点评：该优化方案用 Windows 的 SimSun (衬线) 和 Tahoma (无衬线) 作为 UI 和
Web 中的主要字体使用。并且，解决了字体替换后的几个问题：
(1) 针对中文选字混乱问题 (同一句话中的不同汉字可能用不同字体进行渲染)，我将所
有亚洲字符都强制使用 SimSun 显示。
(2) 由于 SimSun 的小字体为点阵字体，所以关闭了 8-17px 字体的抗锯齿以达到理想
的显示效果。
(3) 设置了 SimSun 和 Tahoma 的最小字体尺寸以保证显示效果

已知 Bug: SimSun 英文和数字粗体效果不佳，可以下载网友制作的 SimSunbd 拷贝入
字体目录（SimSunbd 的英文和数字用的好像是 Tahoma）

cook02 连接 windows 的远程桌面

安装 rdesktop:
sudo apt-get install rdesktop

使用：
rdesktop ip
rdesktop -f ip (全屏)

cook03 使 ntfs 分区可写

安装 ntfs-config:
sudo apt-get install ntfs-config

配置：
sudo ntfs-config
勾选两个选项 (或根据需要勾选)

cook04 访问 Windows 共享文件夹

执行：nautilus-connect-server
填写你要访问的 Windows 共享文件夹的相关信息，这个文件夹就会被 mount 到你的
Desktop 下。同样的方法可以 mount FTP 文件夹等等。

cook05 使 Windows 能访问 ubuntu 的文件 (samba)

执行：shares-admin
按需要设置

cook06 播放各种视频

(1) 下载 mplayer 的 Codec Package：
http://www.mplayerhq.hu/design7/dload.html#binary_codecs
(2) 将下载的 Codec Package 解压到 /usr/lib/win32
(3) 安装 mplayer 或各种 mplayer 的前端程序，我用的是 mplayer 的一个前端：
smplayer
(4) 如果你的 mplayer 的播放画面不正常，请检查你的 Video Output Driver 是否为
x11
(5) 若要显示中文字幕，需要在播放器内设置字幕用字体和字幕编码

cook07 使用 fcitx 替换 scim

由于使用习惯和兼容性原因，许多人不得不使用 fcitx，安装方法如下：

sudo apt-get install im-switch fcitx
im-switch -s fcitx -z default

如果你要在 en 的 locale 下使用 fcitx，请将 /etc/gtk-2.0/gtk.immodules 中的：

“xim” “X Input Method” “gtk20″ “/usr/share/locale” “ko:ja:th:zh”

替换为

“xim” “X Input Method” “gtk20″ “/usr/share/locale” “en:ko:ja:th:zh”

cook08 Panel 中的好东东

ubuntu 桌面上下各有一个 Panel。其实，我们可以自定义 Panel 上面的内容，比如说
：
天气预报，本本的电量显示、亮度调节，CPU 频率监视，系统负载监视，还有一个强制
终止程序的按钮 (Force Quit)等等。
你也可以把自己建立的 Laucher 拖动到 Panel 上，当然，你也可以在别的位置建立自
己的 Panel。

cook09 使用 HP 打印机

ubuntu 自带了 HP 系列打印机的驱动 oo2zjs，但奇怪的是，这个驱动居然是不能用的
。
用户需要卸载掉 ubuntu 自带的 oo2zjs，安装新的 oo2zjs。

具体过程如下：

确认编译环境已安装：
sudo apt-get install build-essential

下载 oo2zjs：
wget -O foo2zjs.tar.gz http://foo2zjs.rkkda.com/foo2zjs.tar.gz

解压：
tar zxf foo2zjs.tar.gz
cd foo2zjs

卸载原有 foo2zjs：
make uninstall

编译，安装：
make
./getweb 1020 这里是你的打印机型号
sudo make install install-hotplug cups

在 ubuntu 的打印机管理中添加你的打印机 (不要选 suggest 的那个驱动):
sudo gnome-cups-manager

最后，重启 cups:
sudo make cups

cook10 开启 Vim 的“简易模式”

ubuntu 中 Vim 只有 console 的版本，且是传统 Unix 的行为模式，即在 INSERT
MODE 下，方向键和退格键是不起作用的。可以手动开启“类 Windows”模式，即在
COMMAND MODE 下输入：

:behave mswin

[附录1]

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">

<fontconfig>

    
    <match target="font">
        <test compare="contains" name="lang" >
            <string>zh-cn</string>
            <string>zh-tw</string>
            <string>ja</string>
            <string>ko</string>
        </test>
        <edit name="family"><string>SimSun</string></edit>
    </match>

    
    <match target="font" >
        <edit mode="assign" name="antialias"><bool>true</bool></edit>
        <edit mode="assign"
name="hintstyle"><const>hintslight</const></edit>
        <edit mode="assign" name="hinting"><bool>true</bool></edit>
        <edit mode="assign" name="autohint"><bool>false</bool></edit>
    </match>

    
    <match target="font" >
        <test compare="more_eq" name="pixelsize"
qual="any"><double>8</double></test>
        <test compare="less_eq" name="pixelsize"
qual="any"><double>17</double></test>
        <edit mode="assign" name="antialias"><bool>false</bool></edit>
    </match>

    
    <match target="font" >
        <test name="family" qual="any">
            <string>SimSun</string>
            <string>NSimSun</string>
            <string>Tahoma</string>
        </test>
        <test compare="more_eq" name="pixelsize"><int>8</int></test>
        <test compare="less_eq" name="pixelsize"><int>12</int></test>
        <edit compare="eq" name="pixelsize"><int>12</int></edit>
    </match>

</fontconfig>

posted @ 2007-09-09 10:52 ZelluX 阅读(702) | 评论 (0) | 编辑收藏

Lexical Analysis - Finite Automata

DFA(deterministic finite automaton): 从同一状态出发的边都标记有不同的符号

可以把一个DFA用一个转置矩阵(transition matrix)表示，矩阵的第i行记录状态i向其他状态跳转的情况，edges[i][j]表示状态i时吃掉一个j符号后跳转到edges[i][j]状态。

DFA的一个好处就是可以识别最长的匹配，比如对于IF8这个字符串，可以被识别为一个变量而不是一个关键字加数字。
处理方法是保留最后一次自动机的终结状态Last-Final及其相对应的字符位置Input-Position-at-Last-Final。
每次进入一个终结状态，就更新这两个变量。
遇到死路(dead state, a nonfinal state with no output transitions)，通过这两个变量回复到上一个终结状态。

NFA(nondeterministic finite automaton): 从同一状态出发的边可能标记有相同的符号
由于从同一状态出发选择的路径可能有多条，非确定性自动机总是需要进行“猜测”，而且总要做出正确的猜测。

posted @ 2007-09-06 22:09 ZelluX 阅读(412) | 评论 (0) | 编辑收藏

CSAPP - Explicitly Blocking Signals

看了半天总算对这节有了个大致的感觉，首先看几个和sigprocmask相关的函数：

#include <signal.h>

int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);
int sigemptyset(sigset_t *set);
int sigfillset(sigset_t *set);
int sigaddset(sigset_t *set, int signum);
int sigdelset(sigset_t *set, int signum); // Return: 0 if OK, -1 on error
int sigismember(const sigset_t *set, int signum); // Return: 1 if member, 0 if not, -1 on error

sigprocmask用于改变当前blocked signals的集合，how参数指定了具体的行为：
SIG_BLOCK 把set中的信号添加到blocked列表(blocked = blocked | set)
SIG_UNBLOCK 把set中的信号从blocked列表中移出(blocked = blocked & ~set)
SIG_SETMASK blocked = set

另外，如果oldset的值不是NULL的话，之前的blocked列表会保存在oldset中。

而sigaddset sigdelset sigfillset sigemptyset都是用于操作sigset_t列表的函数。

sigprocmask适用于在父子进程间同步的情况，以一个典型的UNIX shell程序为例，父进程需要在一个job list中记录它所有的子进程，当父进程创建一个子进程时，它把子进程加入到job list中；当父进程reap一个子进程时，就从job list中移出这个进程。
如果不同步父子进程，有可能发生这种情况：
1. 父进程执行fork函数，内核调度新创建的子进程替换父进程运行
2. 在父进程能够再次运行前，子进程终止，成为一个zombie，内核发送SIGCHLD信号给父进程
3. 父进程可以运行前，内核发现了未处理的(pending)SIGCHLD信号，让它由父进程的handler处理
4. handler reap了终止的进程，调用deletejob函数，实际上没有任何作用，因为父进程还没有把该子进程加入列表
5. handler完成后，内核继续运行父进程，后者调用fork完成后继续，错误地把不存在的子进程加入了job list

posted @ 2007-09-05 12:22 ZelluX 阅读(365) | 评论 (0) | 编辑收藏

O(n)时间求出最小的k个数

一种做法是用最差情况下复杂度也是O(n)的算法求出第k大的数，然后把这个数作为pivot进行一次paritition，再排序该数左边的部分。复杂度为O(n + klgk)

http://en.wikipedia.org/wiki/Selection_algorithm

另外，CLRS上Selection in worst-case linear time算法实际上对in expected linear time在选数时做了一个优化，这样在最差情况下也有O(n)的复杂度了，实际应用中没什么用 (thx to Peter大牛 ^_^)

posted @ 2007-09-04 18:55 ZelluX 阅读(485) | 评论 (0) | 编辑收藏

CLRS - Lower bounds for sorting

搬到张江了

1. 比较排序法最差情况下至少需要omega(n lgn)
证明：
通过决策树(decision tree)分析比较排序，n!种可能情况都要被该决策树的叶子覆盖。
设树深度为h，有
n! <= l <= 2^h得h >= lg(n!) = omega(n lgn)

2. 计数排序
很简单的排序算法，不过后面的C数组的运用比较技巧。
for i = 1 to k
    do c[i] = 0
for j = 1 to length(a)
    do c[a[j]] = c[a[j]] + 1
// c 数组记录了各个元素的出现次数
for i = 1 to k
    do c[i] = c[i] + c[i - 1]
// c[i] 记录了小于或等于i的元素个数
for j = length(a) downto 1
    do b[c[a[j]]] = a[j]
        c[a[j]] = c[a[j]] - 1
复杂度： k = O(n)时，复杂度为theta(n)

posted @ 2007-09-02 22:22 ZelluX 阅读(336) | 评论 (0) | 编辑收藏

阅读进度

Arch 安装手记

texttable - module for creating simple ASCII tables

燕曦版友信息统计脚本

Lex 使用笔记

[zz]一个很不错的Ubuntu配置帖子

Lexical Analysis - Finite Automata

CSAPP - Explicitly Blocking Signals

O(n)时间求出最小的k个数

CLRS - Lower bounds for sorting

日历

常用链接

留言簿(21)

随笔分类(390)

随笔档案(389)

文章档案(7)

相册

15ers

友情链接

收藏夹

搜索

积分与排名

最新随笔

最新评论

阅读排行榜

评论排行榜