Post archive for August 31, 2006 - My Implicit Life (我的隐式生活)

August 31, 2006

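I recently wrote a client/server download tool for e-books: one server program and one client program.

The goal is simply to reach, from the office, the e-books I have been collecting at home for years, so I can look things up easily.

It took one or two weeks. The code is rough, but it uses no third-party products (no server, no database).

It now holds close to 200 books.

Note: the server runs on my home ADSL line, so it is slow and may go offline at any time. It is just for fun, after all.

If you are interested, install JDK 1.5 first, then run the exe file in the archive below.

Click here to download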

User ID: blogjava
Password: blogjava

posted @ 2006-10-15 13:21 marco Views (3513) | Comments (9)

JAVA Container Class

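The classes in the Java Collections Framework really are the most important basic API; it is hard to implement almost any algorithm without them.

So it is well worth understanding this pile of "collection" classes. One note: I had always called them collection classes, but the author of Thinking in Java seems to look down on that term and says that, strictly speaking, they should be called container classes. After reading his whole chapter on them, I think he has a point.

He says the container library contains two broad categories, Collection and Map, and that Collection in turn divides into List and Set. All of these abstract concepts are, of course, defined as interfaces.

This classification does follow the inheritance relationships between the types strictly, but I have always found it awkward: when it comes to actually writing code, it is still hard to choose. Of course, using ArrayList anytime and anywhere will always get the job done, but that is a bit crude.

So I came up with my own way of looking at it. Go back to the most basic data-structure level: call it a Collection or a Container, either way it describes a bunch of things. The first chapter of a data structures course introduces the sequential structure that is allocated contiguously in memory, called a sequential list (if my memory serves), while the one allocated in scattered pieces is the linked list, most commonly the singly linked list. These two are the foundation of many more complex data structures. Remember? Only after covering them did the course move on to stacks, queues, binary trees, and directed and undirected graphs. So the sequential structure is very basic. In Java, a linear sequence corresponds to the List interface: the ordinary sequential list is ArrayList (efficient random lookup by index), and the linked list is LinkedList (efficient insertion and deletion; its implementation is actually doubly linked rather than singly linked). Their trade-offs were covered to death back in school, so I will not repeat them here.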
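To make the trade-off concrete, here is a minimal sketch. The class and method names (ListChoiceDemo, third, insertBefore) are made up for illustration. Both lists expose the same List interface; what differs is the cost: get(index) is cheap on ArrayList, while inserting through a ListIterator is cheap on LinkedList once the position has been reached.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class ListChoiceDemo {
    // Random access by index: constant time on ArrayList,
    // a walk along the nodes on LinkedList.
    static String third(List<String> books) {
        return books.get(2);
    }

    // Inserts value in front of the first element equal to 'before'.
    // On a LinkedList the insertion itself only relinks nodes;
    // on an ArrayList it shifts every later element.
    static void insertBefore(List<String> books, String before, String value) {
        ListIterator<String> it = books.listIterator();
        while (it.hasNext()) {
            if (it.next().equals(before)) {
                it.previous();   // step back so add() lands in front of the match
                it.add(value);
                return;
            }
        }
    }

    public static void main(String[] args) {
        List<String> array = new ArrayList<String>();
        List<String> linked = new LinkedList<String>();
        for (String s : new String[] {"a", "b", "d"}) {
            array.add(s);
            linked.add(s);
        }
        insertBefore(array, "d", "c");
        insertBefore(linked, "d", "c");
        System.out.println(array + " " + linked + " third=" + third(array));
    }
}
```

The result is identical for both lists; only the running time differs as the lists grow.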

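With these two structures in hand, you rarely need a dedicated Stack or Queue class, because both are easy to build on top of them. (java.util.Stack does exist, but it is a legacy class, and since JDK 1.5 Queue is an interface that LinkedList implements.)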
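As a rough sketch of that idea, a LinkedList can play both roles just by choosing which end to add to and remove from. The class and method names here are invented for the example.

```java
import java.util.LinkedList;

class StackQueueDemo {
    // Stack behaviour: push and pop at the same end (last in, first out).
    static String popOrder(String... items) {
        LinkedList<String> stack = new LinkedList<String>();
        for (String item : items) {
            stack.addFirst(item);
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.removeFirst());
        }
        return out.toString();
    }

    // Queue behaviour: add at the tail, remove from the head (first in, first out).
    static String pollOrder(String... items) {
        LinkedList<String> queue = new LinkedList<String>();
        for (String item : items) {
            queue.addLast(item);
        }
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            out.append(queue.removeFirst());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(popOrder("a", "b", "c"));   // stack order
        System.out.println(pollOrder("a", "b", "c"));  // queue order
    }
}
```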

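So how do Set and Map relate to List?

I think of them as Lists that are unordered and hold no duplicates (a Map is just two Lists with a mapping between them).

Being unordered and duplicate-free means the overriding need for Set and Map is lookup by key (the element object itself). So, for efficiency, they have to hash.

Hence HashSet and HashMap.

And if you really must preserve the order in which the elements were inserted, there are LinkedHashSet and LinkedHashMap.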
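A small sketch of the difference, with invented names and made-up sample data: HashSet only promises uniqueness and fast lookup, while LinkedHashSet and LinkedHashMap additionally iterate in insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

class HashContainersDemo {
    // Number of distinct titles: a HashSet silently drops duplicates.
    static int distinctCount(String... titles) {
        return new HashSet<String>(Arrays.asList(titles)).size();
    }

    // Distinct titles in first-insertion order, which LinkedHashSet preserves.
    static List<String> distinctInOrder(String... titles) {
        return new ArrayList<String>(new LinkedHashSet<String>(Arrays.asList(titles)));
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("c", "a", "c", "b"));
        System.out.println(distinctInOrder("c", "a", "c", "b"));

        // LinkedHashMap: lookup by key, iteration in insertion order.
        // The keys and values are made-up sample data.
        Map<String, Integer> pages = new LinkedHashMap<String, Integer>();
        pages.put("book-b", 200);
        pages.put("book-a", 100);
        System.out.println(pages.get("book-a") + " " + pages.keySet());
    }
}
```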

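Conclusion:

If your requirements are
1: the objects you put in the container are unordered and unique;
2: you need to look them up often.
then use HashSet or HashMap.

PS: these two conditions are really the same thing, because if the objects were not unique, why would you look them up by key?

If you also need to keep the elements in order, so that iteration does not visit them in arbitrary order, choose LinkedHashSet or LinkedHashMap.

If your requirements do not meet 1 and 2 above, rest assured that a List can handle it; you only need to think a little about whether ArrayList or LinkedList fits better.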

A few asides:

As for hashing, be sure your element objects override hashCode() and equals(); otherwise you might as well give up and go to bed.
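Here is a minimal sketch of why this matters. Book and its isbn field are hypothetical, invented for this example and assumed non-null. Without the two overrides, a HashSet would compare object identity, and a second Book built from the same ISBN would not be found.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical element type: two Book objects are "the same" when their ISBNs match.
final class Book {
    private final String isbn;

    Book(String isbn) {
        this.isbn = isbn;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Book)) return false;
        return isbn.equals(((Book) other).isbn);
    }

    @Override
    public int hashCode() {
        return isbn.hashCode();   // equal objects must produce equal hash codes
    }
}

class BookKeyDemo {
    // Stores one Book, then looks it up with a different but equal instance.
    static boolean foundByEqualCopy(String isbn) {
        Set<Book> shelf = new HashSet<Book>();
        shelf.add(new Book(isbn));
        return shelf.contains(new Book(isbn));
    }

    public static void main(String[] args) {
        System.out.println(foundByEqualCopy("isbn-1"));
    }
}
```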

As for all these containers, remember that there is a helper type called Iterator; use it for traversal whenever you can.
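One practical reason to prefer Iterator, sketched below with invented names: it works the same way on every container, and its remove() method is the safe way to delete elements while traversing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorDemo {
    // Returns a copy of titles without the entries shorter than minLength.
    // Removal goes through the Iterator, so the traversal stays valid.
    static List<String> withoutShortTitles(List<String> titles, int minLength) {
        List<String> copy = new ArrayList<String>(titles);
        Iterator<String> it = copy.iterator();
        while (it.hasNext()) {
            if (it.next().length() < minLength) {
                it.remove();
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Java", "C", "Thinking in Java");
        System.out.println(withoutShortTitles(titles, 4));
    }
}
```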

As for the older Stack, Vector, and Hashtable, I hear they should not be used anymore. Got it!!

posted @ 2006-09-20 16:53 marco Views (2352) | Comments (0)

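A Short Summary of Java String Operations (Regular Expression)

Almost all information is spread and recorded in the form of text.

In a computer, text is a collection of characters, that is, a string. C is so prone to overflows precisely because its strings were poorly designed, and other high-level languages have improved on this a great deal.

Programmers' daily work differs widely with their technical focus and application area, but string handling is the one job nobody can ever avoid.

While out running yesterday, I thought about how many string operations there are (search, match, split, replace). It all feels cluttered; is there a basic set that these operations can be reduced to?

I am not sure it is right, but I came up with one: the basic operation is search, where search means finding the start position (start index) and end position (end index) of the target string within the input string.

With this basic set, the other operations are easy to derive:

Partial match: the search operation must return at least one start index.

Full match: the search operation must return at least one match whose start index is zero and whose end index is the full length of the input string.

split: after the search, cut out the text between the previous end index and the current start index.

replace: after the search, swap the text between the start index and the end index for something else.

So in the end they are all just extensions of a single search operation. Thinking of it this way makes everything much clearer.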
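The derivation above can be sketched in code. This is only an illustration for fixed (non-regex) target strings: SearchBasedOps and all of its method names are invented here, and the target is assumed to be non-empty. The single primitive is search, built on String.indexOf; everything else only consumes the start and end indexes it returns.

```java
import java.util.ArrayList;
import java.util.List;

class SearchBasedOps {
    // The primitive: {start, end} of the first occurrence at or after 'from',
    // or null when there is none. 'end' is exclusive.
    static int[] search(String input, String target, int from) {
        if (target.length() == 0) {
            throw new IllegalArgumentException("target must not be empty");
        }
        int start = input.indexOf(target, from);
        return start < 0 ? null : new int[] {start, start + target.length()};
    }

    // Partial match: search finds at least one occurrence.
    static boolean partialMatch(String input, String target) {
        return search(input, target, 0) != null;
    }

    // Full match: the occurrence starts at 0 and ends at the input's length.
    static boolean fullMatch(String input, String target) {
        int[] hit = search(input, target, 0);
        return hit != null && hit[0] == 0 && hit[1] == input.length();
    }

    // split: keep the text between the previous end and the current start.
    static List<String> split(String input, String target) {
        List<String> parts = new ArrayList<String>();
        int prevEnd = 0;
        int[] hit;
        while ((hit = search(input, target, prevEnd)) != null) {
            parts.add(input.substring(prevEnd, hit[0]));
            prevEnd = hit[1];
        }
        parts.add(input.substring(prevEnd));
        return parts;
    }

    // replace: swap the text between start and end for the replacement.
    static String replace(String input, String target, String replacement) {
        StringBuilder out = new StringBuilder();
        int prevEnd = 0;
        int[] hit;
        while ((hit = search(input, target, prevEnd)) != null) {
            out.append(input, prevEnd, hit[0]).append(replacement);
            prevEnd = hit[1];
        }
        return out.append(input.substring(prevEnd)).toString();
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,,c", ","));
        System.out.println(replace("a,b", ",", ";"));
    }
}
```

Unlike String.split, this sketch keeps trailing empty pieces; it is meant to show the derivation, not to mirror the library's exact behaviour.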

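It follows that how well and how efficiently an API supports search is the yardstick for its string handling. Of course, direct support for match, split, and replace is even better.

Java's most basic support for string search is the indexOf method of String:

int indexOf(String str)
Returns the index within this string of the first occurrence of the specified substring.

My point here is that the target we want to search for is very often not a single fixed string but something that varies. If there are only one or two variants, calling the method above twice will do. But some cases are nearly impossible to enumerate: for example, strings that represent an email address. We cannot possibly list them all.

So we need a general-purpose language for describing string formats. That language is the regular expression (re).

If the str parameter of indexOf accepted an re, the method above would work for this kind of varied search as well.

Unfortunately, indexOf does not accept an re.

So below are the string methods in the Java API that do accept an re (the regex parameter is the re).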

－－－－－－－－－－－－－－－－－－－－－>>
Methods of the String class:

Full match:
boolean matches(String regex)
Tells whether or not this string matches the given regular expression.

Replace all:
String replaceAll(String regex, String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.

Replace first:
String replaceFirst(String regex, String replacement)
Replaces the first substring of this string that matches the given regular expression with the given replacement.

Split:
String[] split(String regex)
Splits this string around matches of the given regular expression.

Split with a limit:
String[] split(String regex, int limit)
Splits this string around matches of the given regular expression.

<<－－－－－－－－－－－－－－－－－－－－－
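A quick sketch of the String methods listed above in use. The class name, method names, and sample patterns are my own choices for illustration, not anything prescribed by the API.

```java
import java.util.Arrays;

class StringRegexDemo {
    // matches() succeeds only if the whole string fits the pattern.
    static boolean looksLikeDate(String s) {
        return s.matches("\\d{4}-\\d{2}-\\d{2}");
    }

    // replaceAll(): every run of whitespace becomes a single space.
    static String collapseSpaces(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // replaceFirst(): only the first digit is masked.
    static String maskFirstDigit(String s) {
        return s.replaceFirst("\\d", "#");
    }

    // split(): commas with optional surrounding whitespace are the separators.
    static String[] fields(String s) {
        return s.split("\\s*,\\s*");
    }

    // split() with a limit of 2: at most two pieces, the rest stays unsplit.
    static String[] headAndRest(String s) {
        return s.split("\\s*,\\s*", 2);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDate("2006-08-31"));
        System.out.println(collapseSpaces("a   b\t c"));
        System.out.println(maskFirstDigit("a1b2"));
        System.out.println(Arrays.toString(fields("a , b,c")));
        System.out.println(Arrays.toString(headAndRest("a , b,c")));
    }
}
```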

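Sadly, the String class has no search method that supports an re, so to search with an re you have to use Java's dedicated re library.

The re library (java.util.regex) consists mainly of two classes. One is Pattern which, as the name suggests, represents the re. The other is Matcher, which reflects the current state of matching (for example, it holds the position found by the latest search, the matched text, and so on).

Typically, the re expression is handed to Pattern when it is created (Pattern.compile(regex)), and the input string (the text to be scanned) is handed to Matcher when it is created (pattern.matcher(input)).

Then just use Matcher's search methods. The Matcher API that actually provides search is find; the relevant methods are listed below.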
－－－－－－－－－－－－－－－－－－－－－>>
Search-related methods of the Matcher class:

boolean lookingAt()
Attempts to match the input sequence, starting at the beginning, against the pattern.

boolean matches()
Attempts to match the entire input sequence against the pattern.

boolean find()
Attempts to find the next subsequence of the input sequence that matches the pattern.

String group()
Returns the input subsequence matched by the previous match.

<<－－－－－－－－－－－－－－－－－－－－－

The first three are search methods and return whether they succeeded. The fourth returns the text matched by the latest search; the start index and end index themselves are available from Matcher's start() and end() methods.

OK, so searching with an re is now taking shape.
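Putting the pieces together, here is a minimal sketch of an re search. RegexSearchDemo and findAll are invented names; the java.util.regex calls are the ones listed above, plus start() and end() for the indexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSearchDemo {
    // Collects every match as "start-end:text", where end is exclusive.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> hits = new ArrayList<String>();
        while (m.find()) {
            hits.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "ab12cd345"));

        // lookingAt() anchors at the beginning only; matches() needs the whole input.
        Matcher m = Pattern.compile("\\d+").matcher("12ab");
        System.out.println(m.lookingAt() + " " + m.matches());
    }
}
```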

Of course, Pattern and Matcher also provide match, split, and replace operations that use an re directly.

－－－－－－－－－－－－－－－－－－－－－>>
String operation methods of the Pattern class

Full match:
static boolean matches(String regex, CharSequence input)
Compiles the given regular expression and attempts to match the given input against it.

Split:
String[] split(CharSequence input)
Splits the given input sequence around matches of this pattern.

Split with a limit:
String[] split(CharSequence input, int limit)
Splits the given input sequence around matches of this pattern.

String operation methods of the Matcher class

Replace all:
String replaceAll(String replacement)
Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.

Replace first:
String replaceFirst(String replacement)
Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.

Dynamic replace (the replacement can vary with the text being replaced):
Matcher appendReplacement(StringBuffer sb, String replacement)
Implements a non-terminal append-and-replace step.

StringBuffer appendTail(StringBuffer sb)
Implements a terminal append-and-replace step.

<<－－－－－－－－－－－－－－－－－－－－－
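The dynamic replace is the least obvious of these, so here is a small sketch. The class and method names and the "double every number" rule are made up for illustration. Each find() is followed by appendReplacement with a replacement computed from the matched text, and appendTail copies whatever follows the last match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DynamicReplaceDemo {
    // Replaces every number in the input with twice its value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // The replacement contains only digits here, so it needs no escaping
            // of '$' or '\', which appendReplacement treats specially.
            m.appendReplacement(sb, String.valueOf(doubled));
        }
        m.appendTail(sb);   // copy the text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 books, 12 pages"));
    }
}
```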

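Summary:
When you must use an re, the search operation requires Pattern and Matcher, and so does dynamic replace. The other operations (match, replace, split) can be done with Pattern and Matcher or directly with String; I recommend sticking with our good old String.

Note: all of the above is based on the documentation for JDK 1.4 and later; I take no responsibility if it does not work on earlier versions.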

posted @ 2006-08-31 15:13 marco Views (2719) | Comments (0)