December 23, 2005

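Time efficiency of different ways to iterate over a list

While reading Java高效编程 (a book on efficient Java programming), I saw that there are two ways to iterate over an ArrayList.
Assume a is an ArrayList:

1. for (int i=0;i<a.size();i++) {
2. for (int i=0,n=a.size();i<n;i++) {

Somewhat skeptical, I ran an experiment, and method 2 is indeed a little faster, presumably because each call to a.size() costs a little extra time. Then I remembered that JDK 1.5 also introduced the for/each loop, so I compared that as well, and the result was a bit surprising.

The source program is as follows: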

import java.util.ArrayList;

public class ProfileArrayList {

    public static void main(String[] args) {
        // Build a list of 15000 strings to iterate over.
        ArrayList<String> s = new ArrayList<String>();
        for (int i = 0; i < 15000; i++) {
            s.add("" + System.currentTimeMillis());
        }
        System.out.println("Start ");
        testOne(s);
        testTwo(s);
        testThree(s);
        System.out.println("End ");
    }

    // Method 1: indexed loop that calls a.size() on every iteration.
    private static void testOne(ArrayList<String> a) {
        int j = 0;
        String s = null;
        for (int i = 0; i < a.size(); i++) {
            s = a.get(i);
            j++;
        }
    }

    // Method 2: indexed loop with the size cached in n.
    private static void testTwo(ArrayList<String> a) {
        int j = 0;
        String s = null;
        for (int i = 0, n = a.size(); i < n; i++) {
            s = a.get(i);
            j++;
        }
    }

    // Method 3: the JDK 1.5 for/each loop.
    private static void testThree(ArrayList<String> a) {
        int j = 0;
        for (String s : a) {
            j++;
        }
    }
}

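Results from the profiling tool:
Method      Run time
testOne     0.055764
testTwo     0.043821
testThree   0.132451

In other words, the JDK 1.5 for/each loop is the slowest. I found that hard to believe. At first I thought the assignment was responsible, but after adding an assignment statement to the other two methods, for/each was still the slowest. An interesting result.

From a readability standpoint, spending a little more time on for/each hardly matters. Then again, the other two forms are not exactly "unclear" either, heh. Use your own judgment.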
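One likely contributor to the gap (a property of the language, not something measured in this post): over a collection, the for/each loop is compiled into an Iterator loop, so every pass calls hasNext() and next() instead of a bare get(i). A minimal sketch follows; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class; the names are illustrative, not from the post.
final class ForEachDesugared {

    // The for/each form, as written in testThree.
    static int countWithForEach(List<String> a) {
        int j = 0;
        for (String s : a) {
            j++;
        }
        return j;
    }

    // Roughly what the compiler generates for the loop above.
    static int countWithIterator(List<String> a) {
        int j = 0;
        for (Iterator<String> it = a.iterator(); it.hasNext(); ) {
            String s = it.next();
            j++;
        }
        return j;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        for (int i = 0; i < 15000; i++) {
            list.add("" + i);
        }
        System.out.println(countWithForEach(list));
        System.out.println(countWithIterator(list));
    }
}
```

Timings from a single short run are also sensitive to JIT warm-up, so figures like the ones above are probably best read as indicative rather than exact.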

posted @ 2006-03-03 12:00 Raymond的Java笔记

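Load testing with JMeter

JMeter is an Apache Jakarta project for load testing software. It can load test not only HTTP but also databases (via JDBC), FTP, web services, Java objects, and more.

Project URL: http://jakarta.apache.org/jmeter

Running: run jmeterw.bat in the bin directory. jmeter.bat also works, but it leaves a command window open.

Note that JMeter picks its menu language from the current system locale. To switch back to English for convenience, edit jmeter.properties and set language=en. (The 2.1.1 version I downloaded mistranslates "Exit" in the Chinese menu as "推出" instead of "退出", which looks wrong no matter how I read it.)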

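Usage:

A JMeter Test Plan is a tree containing several types of elements, and some elements inherit from the elements above them in the tree (the principle is a bit like log4j). The element types are briefly described below:

1. ThreadGroup
    As the name implies, this is a group of threads. A test must have a ThreadGroup element as its base (otherwise no test threads would run). This element configures how many threads to run, how many times each thread loops, the total time over which all threads are started (Ramp-up period), and so on.

2. Controller
    Includes Logical Controllers and Samplers. The former provide logical control such as interleaving, conditions, and loops. A Sampler is what actually does the work; "HTTP Request", for example, executes an HTTP request.

3. Listener
    A Listener listens to the request process; put simply, it is what collects the results. For example, Simple Data Writer writes results to a text file (in fact, every Listener can write data to a file), and View Results in Table shows the results in a table.

4. Timer
    Controls time delays and similar behavior in the execution flow.

5. Assertion
    An assertion, added to a Sampler, checks the returned result, for example whether the HTTP response contains a certain string. If the assertion holds, JMeter marks the request as successful; otherwise it marks it as failed.

6. Configuration Element
    Elements used for configuration; very useful. Because the test plan is a tree with inheritance, you can specify a Configuration Element at a higher level, and lower-level Samplers that do not explicitly specify their own configuration inherit the higher-level settings. (Much like log4j, right?)

7. Pre-Processor/Post-Processor Elements
    Do pre-processing and post-processing before and after a Sampler runs, for example dynamically modifying request parameters (pre-processing) or extracting information from the response (post-processing).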

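Example: the simplest possible HTTP load test, with 10 threads accessing one URL, 100 times per thread.
Steps:
1. Under the Test Plan, add a Thread Group. In its configuration, set the number of threads to 10 and the loop count to 100.
2. Under the Thread Group, add an HTTP Request (a Sampler). In its configuration, fill in the host, port, protocol, path, parameters, and so on.
3. Under the HTTP Request, add a View Results in Table. If you want the records written to a file, fill in the file path.
4. Save the Test Plan, then choose Run from the Run menu. When the Run menu item changes from gray back to black, the run has finished. You can see the results under View Results in Table.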
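For intuition only, the plan above behaves roughly like the plain-Java sketch below: a pool of 10 threads, each running the same sampler 100 times. This is not JMeter's API; the class name, the runPlan method, and the Runnable standing in for the HTTP Request sampler are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from JMeter.
final class ThreadGroupSketch {

    // Runs `sampler` `loops` times on each of `threads` threads and
    // returns the total number of sampler executions.
    static int runPlan(int threads, final int loops, final Runnable sampler) {
        final AtomicInteger executed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < loops; i++) {
                        sampler.run();
                        executed.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown(); // accept no new work; tasks already queued still run
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("plan did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executed.get();
    }

    public static void main(String[] args) {
        // A real sampler would issue the HTTP request here.
        Runnable fakeRequest = new Runnable() {
            public void run() { /* placeholder for one request */ }
        };
        System.out.println(runPlan(10, 100, fakeRequest));
    }
}
```

The sketch leaves out ramp-up, timers, listeners, and assertions, which is exactly the bookkeeping JMeter handles for you.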

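See the official documentation for detailed descriptions of the elements.

JMeter is feature-rich, highly extensible, and free, so it is well worth studying and using.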

posted @ 2006-03-01 10:04 Raymond的Java笔记

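Profiling with TPTP and Eclipse: an introduction

This article is only a very brief introduction; treat it as a reference memo.

TPTP is the official Eclipse profiling plugin. On first use it feels powerful.

Download and install: download it from http://www.eclipse.org/tptp/. I chose All-Runtime, unpacked it into the eclipse directory like any other plugin, and then ran eclipse -clean to refresh.

Usage:
Ordinary profiling, simply put, records a program's execution and then analyzes the data to see which methods run longest, which objects use the most memory, which classes have the most instances, and so on. A good introductory sample is here: http://www.eclipse.org/tptp/home/documents/tutorials/profilingtool/profilingexample_32.html so I won't repeat it.

What deserves more discussion is Remote Profiling. The principle is to run an agent process on the remote machine. The program or application server to be profiled is started with a JVM argument that identifies this agent process; the two interact, and the agent sends the collected information to the other side (the machine running Eclipse).

So to do Remote Profiling, you also need to install an agent on the target machine.

Download and install: http://www.eclipse.org/tptp/home/downloads/drops/TPTP-4.0.1.html Choose the Agent Controller download for your operating system; the Runtime package is enough.

After downloading, follow the instructions in getting_started.html to install it. In brief:
1. Add its bin directory to PATH.
2. Run SetConfig to set the parameters. If you want machines other than localhost to have access, be sure to set Network Access Mode; the default is localhost only.
3. Run RAStart to start the agent (on Linux).
4. Add -XrunpiAgent:server=enabled to the JVM arguments used to start the server-side program (for example Tomcat). See the documentation for the other parameter values.
5. From the remote side, you can then start a Profiling session in Eclipse that attaches to this agent controller. It works the same as profiling an application directly in Eclipse.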

posted @ 2006-02-27 14:14 Raymond的Java笔记

Volatile Fields

Sometimes, it seems excessive to pay the cost of synchronization just to read or write an instance field or two. After all, what can go wrong? Unfortunately, with modern processors and compilers, there is plenty of room for error:

Computers with multiple processors can temporarily hold memory values in registers or local memory caches. As a consequence, threads running in different processors may see different values for the same memory location!
Compilers can reorder instructions for maximum throughput. Compilers won't choose an ordering that changes the meaning of the code, but they make the assumption that memory values are only changed when there are explicit instructions in the code. However, a memory value can be changed by another thread!

If you use locks to protect code that can be accessed by multiple threads, then you won't have these problems. Compilers are required to respect locks by flushing local caches as necessary and not inappropriately reordering instructions. The details are explained in the Java Memory Model and Thread Specification developed by JSR 133 (see http://www.jcp.org/en/jsr/detail?id=133). Much of the specification is highly complex and technical, but the document also contains a number of clearly explained examples. A more accessible overview article by Brian Goetz is available at http://www-106.ibm.com/developerworks/java/library/j-jtp02244.html.

NOTE

Brian Goetz coined the following "synchronization motto": "If you write a variable which may next be read by another thread, or you read a variable which may have last been written by another thread, you must use synchronization."

The volatile keyword offers a lock-free mechanism for synchronizing access to an instance field. If you declare a field as volatile, then the compiler and the virtual machine take into account that the field may be concurrently updated by another thread.

For example, suppose an object has a boolean flag done that is set by one thread and queried by another thread. You have two choices:

Use a lock, for example:
```
public synchronized boolean isDone() { return done; }
private boolean done;
```
(This approach has a potential drawback: the isDone method can block if another thread has locked the object.)

Declare the field as volatile:
```
public boolean isDone() { return done; }
private volatile boolean done;
```

Of course, accessing a volatile variable will be slower than accessing a regular variable; that is the price to pay for thread safety.
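As a minimal sketch of the second choice in use, the worker below spins until another thread sets the volatile done flag. The class names and the finish() method are assumptions made for illustration; only isDone() and done come from the text above.

```java
// Hypothetical demo; Worker and finish() are illustrative names.
final class VolatileFlagDemo {

    static final class Worker implements Runnable {
        private volatile boolean done; // written by one thread, read by another
        private long iterations;       // touched only by the worker thread

        public boolean isDone() { return done; }
        public void finish() { done = true; }

        public void run() {
            while (!done) {            // volatile read: a write from finish() becomes visible
                iterations++;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Worker worker = new Worker();
        Thread t = new Thread(worker);
        t.start();
        Thread.sleep(50);
        worker.finish();               // set the flag from the main thread
        t.join(1000);
        System.out.println("worker stopped: " + !t.isAlive());
    }
}
```

Without volatile (or a lock), nothing in the Java memory model would oblige the worker thread to ever observe the write, so the loop could in principle keep running.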

NOTE

Prior to JDK 5.0, the semantics of volatile were rather permissive. The language designers attempted to give implementors leeway in optimizing the performance of code that uses volatile fields. However, the old specification was so complex that implementors didn't always follow it, and it allowed confusing and undesirable behavior, such as immutable objects that weren't truly immutable.

In summary, concurrent access to a field is safe in these three conditions:

The field is volatile.
The field is final, and it is accessed after the constructor has completed.
The field access is protected by a lock.

posted @ 2006-02-19 15:58 Raymond的Java笔记

The Joel Test: 12 Steps to Better Code

By Joel Spolsky
Wednesday, August 09, 2000

Have you ever heard of SEMA? It's a fairly esoteric system for measuring how good a software team is. No, wait! Don't follow that link! It will take you about six years just to understand that stuff. So I've come up with my own, highly irresponsible, sloppy test to rate the quality of a software team. The great part about it is that it takes about 3 minutes. With all the time you save, you can go to medical school.

The Joel Test

1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?

The neat thing about The Joel Test is that it's easy to get a quick yes or no to each question. You don't have to figure out lines-of-code-per-day or average-bugs-per-inflection-point. Give your team 1 point for each "yes" answer. The bummer about The Joel Test is that you really shouldn't use it to make sure that your nuclear power plant software is safe.

A score of 12 is perfect, 11 is tolerable, but 10 or lower and you've got serious problems. The truth is that most software organizations are running with a score of 2 or 3, and they need serious help, because companies like Microsoft run at 12 full-time.

Of course, these are not the only factors that determine success or failure: in particular, if you have a great software team working on a product that nobody wants, well, people aren't going to want it. And it's possible to imagine a team of "gunslingers" that doesn't do any of this stuff that still manages to produce incredible software that changes the world. But, all else being equal, if you get these 12 things right, you'll have a disciplined team that can consistently deliver.

1. Do you use source control?
I've used commercial source control packages, and I've used CVS, which is free, and let me tell you, CVS is fine. But if you don't have source control, you're going to stress out trying to get programmers to work together. Programmers have no way to know what other people did. Mistakes can't be rolled back easily. The other neat thing about source control systems is that the source code itself is checked out on every programmer's hard drive -- I've never heard of a project using source control that lost a lot of code.

2. Can you make a build in one step?
By this I mean: how many steps does it take to make a shipping build from the latest source snapshot? On good teams, there's a single script you can run that does a full checkout from scratch, rebuilds every line of code, makes the EXEs, in all their various versions, languages, and #ifdef combinations, creates the installation package, and creates the final media -- CDROM layout, download website, whatever.

If the process takes any more than one step, it is prone to errors. And when you get closer to shipping, you want to have a very fast cycle of fixing the "last" bug, making the final EXEs, etc. If it takes 20 steps to compile the code, run the installation builder, etc., you're going to go crazy and you're going to make silly mistakes.

For this very reason, the last company I worked at switched from WISE to InstallShield: we required that the installation process be able to run, from a script, automatically, overnight, using the NT scheduler, and WISE couldn't run from the scheduler overnight, so we threw it out. (The kind folks at WISE assure me that their latest version does support nightly builds.)

3. Do you make daily builds?
When you're using source control, sometimes one programmer accidentally checks in something that breaks the build. For example, they've added a new source file, and everything compiles fine on their machine, but they forgot to add the source file to the code repository. So they lock their machine and go home, oblivious and happy. But nobody else can work, so they have to go home too, unhappy.

Breaking the build is so bad (and so common) that it helps to make daily builds, to ensure that no breakage goes unnoticed. On large teams, one good way to ensure that breakages are fixed right away is to do the daily build every afternoon at, say, lunchtime. Everyone does as many checkins as possible before lunch. When they come back, the build is done. If it worked, great! Everybody checks out the latest version of the source and goes on working. If the build failed, you fix it, but everybody can keep on working with the pre-build, unbroken version of the source.

On the Excel team we had a rule that whoever broke the build, as their "punishment", was responsible for babysitting the builds until someone else broke it. This was a good incentive not to break the build, and a good way to rotate everyone through the build process so that everyone learned how it worked.

Read more about daily builds in my article Daily Builds are Your Friend.

4. Do you have a bug database?
I don't care what you say. If you are developing code, even on a team of one, without an organized database listing all known bugs in the code, you are going to ship low quality code. Lots of programmers think they can hold the bug list in their heads. Nonsense. I can't remember more than two or three bugs at a time, and the next morning, or in the rush of shipping, they are forgotten. You absolutely have to keep track of bugs formally.

Bug databases can be complicated or simple. A minimal useful bug database must include the following data for every bug:

complete steps to reproduce the bug
expected behavior
observed (buggy) behavior
who it's assigned to
whether it has been fixed or not

If the complexity of bug tracking software is the only thing stopping you from tracking your bugs, just make a simple 5 column table with these crucial fields and start using it.

For more on bug tracking, read Painless Bug Tracking.

5. Do you fix bugs before writing new code?
The very first version of Microsoft Word for Windows was considered a "death march" project. It took forever. It kept slipping. The whole team was working ridiculous hours, the project was delayed again, and again, and again, and the stress was incredible. When the dang thing finally shipped, years late, Microsoft sent the whole team off to Cancun for a vacation, then sat down for some serious soul-searching.

What they realized was that the project managers had been so insistent on keeping to the "schedule" that programmers simply rushed through the coding process, writing extremely bad code, because the bug fixing phase was not a part of the formal schedule. There was no attempt to keep the bug-count down. Quite the opposite. The story goes that one programmer, who had to write the code to calculate the height of a line of text, simply wrote "return 12;" and waited for the bug report to come in about how his function is not always correct. The schedule was merely a checklist of features waiting to be turned into bugs. In the post-mortem, this was referred to as "infinite defects methodology".

To correct the problem, Microsoft universally adopted something called a "zero defects methodology". Many of the programmers in the company giggled, since it sounded like management thought they could reduce the bug count by executive fiat. Actually, "zero defects" meant that at any given time, the highest priority is to eliminate bugs before writing any new code. Here's why.

In general, the longer you wait before fixing a bug, the costlier (in time and money) it is to fix.

For example, when you make a typo or syntax error that the compiler catches, fixing it is basically trivial.

When you have a bug in your code that you see the first time you try to run it, you will be able to fix it in no time at all, because all the code is still fresh in your mind.

If you find a bug in some code that you wrote a few days ago, it will take you a while to hunt it down, but when you reread the code you wrote, you'll remember everything and you'll be able to fix the bug in a reasonable amount of time.

But if you find a bug in code that you wrote a few months ago, you'll probably have forgotten a lot of things about that code, and it's much harder to fix. By that time you may be fixing somebody else's code, and they may be in Aruba on vacation, in which case, fixing the bug is like science: you have to be slow, methodical, and meticulous, and you can't be sure how long it will take to discover the cure.

And if you find a bug in code that has already shipped, you're going to incur incredible expense getting it fixed.

That's one reason to fix bugs right away: because it takes less time. There's another reason, which relates to the fact that it's easier to predict how long it will take to write new code than to fix an existing bug. For example, if I asked you to predict how long it would take to write the code to sort a list, you could give me a pretty good estimate. But if I asked you to predict how long it would take to fix that bug where your code doesn't work if Internet Explorer 5.5 is installed, you can't even guess, because you don't know (by definition) what's causing the bug. It could take 3 days to track it down, or it could take 2 minutes.

What this means is that if you have a schedule with a lot of bugs remaining to be fixed, the schedule is unreliable. But if you've fixed all the known bugs, and all that's left is new code, then your schedule will be stunningly more accurate.

Another great thing about keeping the bug count at zero is that you can respond much faster to competition. Some programmers think of this as keeping the product ready to ship at all times. Then if your competitor introduces a killer new feature that is stealing your customers, you can implement just that feature and ship on the spot, without having to fix a large number of accumulated bugs.

6. Do you have an up-to-date schedule?
Which brings us to schedules. If your code is at all important to the business, there are lots of reasons why it's important to the business to know when the code is going to be done. Programmers are notoriously crabby about making schedules. "It will be done when it's done!" they scream at the business people.

Unfortunately, that just doesn't cut it. There are too many planning decisions that the business needs to make well in advance of shipping the code: demos, trade shows, advertising, etc. And the only way to do this is to have a schedule, and to keep it up to date.

The other crucial thing about having a schedule is that it forces you to decide what features you are going to do, and then it forces you to pick the least important features and cut them rather than slipping into featuritis (a.k.a. scope creep).

Keeping schedules does not have to be hard. Read my article Painless Software Schedules, which describes a simple way to make great schedules.

7. Do you have a spec?
Writing specs is like flossing: everybody agrees that it's a good thing, but nobody does it.

I'm not sure why this is, but it's probably because most programmers hate writing documents. As a result, when teams consisting solely of programmers attack a problem, they prefer to express their solution in code, rather than in documents. They would much rather dive in and write code than produce a spec first.

At the design stage, when you discover problems, you can fix them easily by editing a few lines of text. Once the code is written, the cost of fixing problems is dramatically higher, both emotionally (people hate to throw away code) and in terms of time, so there's resistance to actually fixing the problems. Software that wasn't built from a spec usually winds up badly designed and the schedule gets out of control. This seems to have been the problem at Netscape, where the first four versions grew into such a mess that management stupidly decided to throw out the code and start over. And then they made this mistake all over again with Mozilla, creating a monster that spun out of control and took several years to get to alpha stage.

My pet theory is that this problem can be fixed by teaching programmers to be less reluctant writers by sending them off to take an intensive course in writing. Another solution is to hire smart program managers who produce the written spec. In either case, you should enforce the simple rule "no code without spec".

Learn all about writing specs by reading my 4-part series.

8. Do programmers have quiet working conditions?
There are extensively documented productivity gains provided by giving knowledge workers space, quiet, and privacy. The classic software management book Peopleware documents these productivity benefits extensively.

Here's the trouble. We all know that knowledge workers work best by getting into "flow", also known as being "in the zone", where they are fully concentrated on their work and fully tuned out of their environment. They lose track of time and produce great stuff through absolute concentration. This is when they get all of their productive work done. Writers, programmers, scientists, and even basketball players will tell you about being in the zone.

The trouble is, getting into "the zone" is not easy. When you try to measure it, it looks like it takes an average of 15 minutes to start working at maximum productivity. Sometimes, if you're tired or have already done a lot of creative work that day, you just can't get into the zone and you spend the rest of your work day fiddling around, reading the web, playing Tetris.

The other trouble is that it's so easy to get knocked out of the zone. Noise, phone calls, going out for lunch, having to drive 5 minutes to Starbucks for coffee, and interruptions by coworkers -- especially interruptions by coworkers -- all knock you out of the zone. If a coworker asks you a question, causing a 1 minute interruption, but this knocks you out of the zone badly enough that it takes you half an hour to get productive again, your overall productivity is in serious trouble. If you're in a noisy bullpen environment like the type that caffeinated dotcoms love to create, with marketing guys screaming on the phone next to programmers, your productivity will plunge as knowledge workers get interrupted time after time and never get into the zone.

With programmers, it's especially hard. Productivity depends on being able to juggle a lot of little details in short term memory all at once. Any kind of interruption can cause these details to come crashing down. When you resume work, you can't remember any of the details (like local variable names you were using, or where you were up to in implementing that search algorithm) and you have to keep looking these things up, which slows you down a lot until you get back up to speed.

Here's the simple algebra. Let's say (as the evidence seems to suggest) that if we interrupt a programmer, even for a minute, we're really blowing away 15 minutes of productivity. For this example, let's put two programmers, Jeff and Mutt, in open cubicles next to each other in a standard Dilbert veal-fattening farm. Mutt can't remember the name of the Unicode version of the strcpy function. He could look it up, which takes 30 seconds, or he could ask Jeff, which takes 15 seconds. Since he's sitting right next to Jeff, he asks Jeff. Jeff gets distracted and loses 15 minutes of productivity (to save Mutt 15 seconds).

Now let's move them into separate offices with walls and doors. Now when Mutt can't remember the name of that function, he could look it up, which still takes 30 seconds, or he could ask Jeff, which now takes 45 seconds and involves standing up (not an easy task given the average physical fitness of programmers!). So he looks it up. So now Mutt loses 30 seconds of productivity, but we save 15 minutes for Jeff. Ahhh!

9. Do you use the best tools money can buy?
Writing code in a compiled language is one of the last things that still can't be done instantly on a garden variety home computer. If your compilation process takes more than a few seconds, getting the latest and greatest computer is going to save you time. If compiling takes even 15 seconds, programmers will get bored while the compiler runs and switch over to reading The Onion, which will suck them in and kill hours of productivity.

Debugging GUI code with a single monitor system is painful if not impossible. If you're writing GUI code, two monitors will make things much easier.

Most programmers eventually have to manipulate bitmaps for icons or toolbars, and most programmers don't have a good bitmap editor available. Trying to use Microsoft Paint to manipulate bitmaps is a joke, but that's what most programmers have to do.

At my last job, the system administrator kept sending me automated spam complaining that I was using more than ... get this ... 220 megabytes of hard drive space on the server. I pointed out that given the price of hard drives these days, the cost of this space was significantly less than the cost of the toilet paper I used. Spending even 10 minutes cleaning up my directory would be a fabulous waste of productivity.

Top notch development teams don't torture their programmers. Even minor frustrations caused by using underpowered tools add up, making programmers grumpy and unhappy. And a grumpy programmer is an unproductive programmer.

To add to all this... programmers are easily bribed by giving them the coolest, latest stuff. This is a far cheaper way to get them to work for you than actually paying competitive salaries!

10. Do you have testers?
If your team doesn't have dedicated testers, at least one for every two or three programmers, you are either shipping buggy products, or you're wasting money by having $100/hour programmers do work that can be done by $30/hour testers. Skimping on testers is such an outrageous false economy that I'm simply blown away that more people don't recognize it.

Read Top Five (Wrong) Reasons You Don't Have Testers, an article I wrote about this subject.

11. Do new candidates write code during their interview?
Would you hire a magician without asking them to show you some magic tricks? Of course not.

Would you hire a caterer for your wedding without tasting their food? I doubt it. (Unless it's Aunt Marge, and she would hate you forever if you didn't let her make her "famous" chopped liver cake).

Yet, every day, programmers are hired on the basis of an impressive resumé or because the interviewer enjoyed chatting with them. Or they are asked trivia questions ("what's the difference between CreateDialog() and DialogBox()?") which could be answered by looking at the documentation. You don't care if they have memorized thousands of trivia about programming, you care if they are able to produce code. Or, even worse, they are asked "AHA!" questions: the kind of questions that seem easy when you know the answer, but if you don't know the answer, they are impossible.

Please, just stop doing this. Do whatever you want during interviews, but make the candidate write some code. (For more advice, read my Guerrilla Guide to Interviewing.)

12. Do you do hallway usability testing?
A hallway usability test is where you grab the next person that passes by in the hallway and force them to try to use the code you just wrote. If you do this to five people, you will learn 95% of what there is to learn about usability problems in your code.

Good user interface design is not as hard as you would think, and it's crucial if you want customers to love and buy your product. You can read my free online book on UI design, a short primer for programmers.

But the most important thing about user interfaces is that if you show your program to a handful of people, (in fact, five or six is enough) you will quickly discover the biggest problems people are having. Read Jakob Nielsen's article explaining why. Even if your UI design skills are lacking, as long as you force yourself to do hallway usability tests, which cost nothing, your UI will be much, much better.

Four Ways To Use The Joel Test

1. Rate your own software organization, and tell me how it rates, so I can gossip.
2. If you're the manager of a programming team, use this as a checklist to make sure your team is working as well as possible. When you start rating a 12, you can leave your programmers alone and focus full time on keeping the business people from bothering them.
3. If you're trying to decide whether to take a programming job, ask your prospective employer how they rate on this test. If it's too low, make sure that you'll have the authority to fix these things. Otherwise you're going to be frustrated and unproductive.
4. If you're an investor doing due diligence to judge the value of a programming team, or if your software company is considering merging with another, this test can provide a quick rule of thumb.

posted @ 2006-02-17 22:02 Raymond的Java笔记

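Building a keyword list sorted by occurrence count in Lucene

Requirement: many keywords are indexed in a Lucene index. I want to get the keyword list for the current user, with each keyword carrying the number of times it has been used.

Solution:
Use a custom HitCollector object. The code is as follows: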

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;

import org.apache.lucene.document.Document;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;

public class TagCollector extends HitCollector {

    private IndexSearcher searcher;
    // Maps each tag name to the number of hits that carry it.
    private HashMap<String,Integer> tagList = new HashMap<String,Integer>();

    public TagCollector(IndexSearcher searcher) {
        this.searcher = searcher;
    }

    // Called once for every matching document.
    @Override
    public void collect(int docID, float score) {
        try {
            Document doc = searcher.doc(docID);
            String[] tagValues = doc.getValues("tag");
            if (tagValues != null) {
                for (int i = 0; i < tagValues.length; i++) {
                    addTagCount(tagValues[i]);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void addTagCount(String tagName) {
        int count = 1;
        if (tagList.containsKey(tagName)) {
            count = tagList.get(tagName) + 1;
        }
        tagList.put(tagName, count);
    }

    public HashMap<String,Integer> getTagList() {
        return tagList;
    }

    @SuppressWarnings("unchecked")
    public ArrayList<TagSummary> getSortedTagList(boolean ascending) {
        ArrayList<TagSummary> list = new ArrayList<TagSummary>();
        Iterator<String> keyIterator = tagList.keySet().iterator();
        while (keyIterator.hasNext()) {
            String key = keyIterator.next();
            int value = tagList.get(key);
            list.add(new TagSummary(key, value));
        }
        Collections.sort(list);
        if (!ascending) {
            Collections.reverse(list);
        }
        return list;
    }
}

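How it works: the collect method of this object is called for every hit found by the search, so you can keep a HashMap in the object to accumulate the count for each keyword.

Sorting relies on a separate TagSummary class, which is not shown in detail here.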
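Since the post does not show TagSummary, the following is only a guess at its shape: a small value class that implements Comparable so that Collections.sort orders entries by count. The field names, the accessors, and the tie-break on tag name are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical reconstruction; the original TagSummary class is not shown in the post.
final class TagSummary implements Comparable<TagSummary> {
    private final String name;
    private final int count;

    TagSummary(String name, int count) {
        this.name = name;
        this.count = count;
    }

    String getName() { return name; }
    int getCount() { return count; }

    // Orders by count ascending, then by name so the order is deterministic.
    public int compareTo(TagSummary other) {
        if (count != other.count) {
            return count < other.count ? -1 : 1;
        }
        return name.compareTo(other.name);
    }

    public static void main(String[] args) {
        List<TagSummary> list = new ArrayList<TagSummary>();
        list.add(new TagSummary("java", 5));
        list.add(new TagSummary("lucene", 2));
        list.add(new TagSummary("resin", 9));
        Collections.sort(list);
        Collections.reverse(list); // descending, as getSortedTagList(false) does
        for (TagSummary t : list) {
            System.out.println(t.getName() + " " + t.getCount());
        }
    }
}
```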

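Problem: this is a straightforward approach, but I believe calling it frequently would put a heavy load on the server. Caching is worth considering: as long as the keywords have not changed, run this method only on the first call and keep the result in a database table or in memory. When there are updates, compare version numbers to decide whether the cached result needs to be refreshed.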
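The caching idea could be sketched in memory as below. Everything here is hypothetical: the Loader interface stands in for the expensive search (for example, a run of TagCollector), and the version number is assumed to be something the application increments whenever a user's keywords change.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache; Loader and the version number are assumptions
// used to illustrate the idea, not part of Lucene.
final class VersionedTagCache {

    interface Loader {
        Map<String, Integer> load(); // the expensive search
    }

    private final Loader loader;
    private Map<String, Integer> cached;
    private long cachedVersion = -1;

    VersionedTagCache(Loader loader) {
        this.loader = loader;
    }

    // Re-runs the loader only when the caller's version differs from the cached one.
    synchronized Map<String, Integer> get(long currentVersion) {
        if (cached == null || cachedVersion != currentVersion) {
            cached = loader.load();
            cachedVersion = currentVersion;
        }
        return cached;
    }

    public static void main(String[] args) {
        final int[] loads = {0};
        VersionedTagCache cache = new VersionedTagCache(new Loader() {
            public Map<String, Integer> load() {
                loads[0]++;
                Map<String, Integer> m = new HashMap<String, Integer>();
                m.put("java", 3);
                return m;
            }
        });
        cache.get(1);
        cache.get(1); // same version: served from the cache
        cache.get(2); // version changed: reloaded
        System.out.println("loads: " + loads[0]);
    }
}
```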

posted @ 2006-02-04 14:26 Raymond的Java笔记

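Fixing garbled Chinese text with Resin + Struts

Problem:
All Chinese text received through a Struts ActionForm was garbled. For example, the submitted string "测试" arrived as "??????è????". At first I thought it was the usual encoding-detection problem, but rebuilding the string from byte[] arrays in various encodings still did not yield the correct Chinese. Yet when a plain JSP received the form data, the Chinese was perfectly fine.
I began to suspect that somewhere in the Struts processing flow the wrong encoding was being applied, leaving the final result completely scrambled. After searching through many articles, I finally found one that came close.
Solution:
Define a filter that does only one thing:
request.setCharacterEncoding("UTF-8");
In the filter mapping in web.xml, use the same mapping as the Struts action.

Explanation: the Filter intercepts the web request first. Once the correct CharacterEncoding is set there, the components that process the request afterwards will not get it wrong. Without the Filter, the character encoding I got on my Resin server was null; my guess is that different Struts components interpret and handle null inconsistently, which causes the error.

Note that all my pages are UTF-8 encoded, which is why the filter specifies UTF-8. For any other encoding, change this accordingly.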
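To illustrate what a wrong request encoding does to form data, the plain-Java sketch below decodes UTF-8 bytes as ISO-8859-1, which is what a servlet container typically falls back to when no request encoding has been set. It needs no servlet API; the class and method names are made up, and it shows only a single wrong decode, whereas the 11-character result quoted above suggests more than one conversion took place in the actual case.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of a single wrong decode; not the actual Struts/Resin code path.
final class RequestEncodingDemo {

    // Decodes UTF-8 bytes as ISO-8859-1, as may happen when the
    // request character encoding is left unset.
    static String decodeWrongly(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the single wrong decode above.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u6d4b\u8bd5"; // the two-character test string from the post
        String garbled = decodeWrongly(original);
        System.out.println(garbled.length());                 // 6 chars instead of 2
        System.out.println(repair(garbled).equals(original)); // round trip succeeds
    }
}
```

Setting the encoding once in a filter avoids having to repair strings after the fact, which stops working as soon as a second lossy conversion is involved.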

posted @ 2006-01-19 23:28 Raymond的Java笔记

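Determining whether a character is Chinese in Java

To determine whether a character is Chinese, I looked through the API today and found a method. The code is as follows: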

   System.out.println(Character.UnicodeBlock.of('琴'));
   System.out.println(Character.UnicodeBlock.of('j'));
   System.out.println(Character.UnicodeBlock.of(3267));

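Output:
CJK_UNIFIED_IDEOGRAPHS
BASIC_LATIN
KANNADA

This is actually not quite sufficient, because a result of "CJK_UNIFIED_IDEOGRAPHS" could also come from Japanese or Korean text. It is good enough for my needs, though. To identify Chinese precisely, look up the Unicode code ranges.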
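Wrapped up as a helper, the check might look like the sketch below. The class and method names are assumptions; the only API used is Character.UnicodeBlock.of, as in the snippet above.

```java
// Hypothetical helper; the names are illustrative.
final class CjkCheck {

    // True when the code point lies in the CJK Unified Ideographs block.
    // Kana and Hangul live in other blocks, but Han characters used in
    // Japanese or Korean text still return true.
    static boolean isCjkIdeograph(int codePoint) {
        return Character.UnicodeBlock.of(codePoint)
                == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS;
    }

    // True when at least one code point of the text is a CJK ideograph.
    static boolean containsCjk(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isCjkIdeograph(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isCjkIdeograph(0x7434)); // the first character tested in the post
        System.out.println(isCjkIdeograph('j'));
        System.out.println(isCjkIdeograph(3267));   // the Kannada code point from the post
        System.out.println(containsCjk("abc\u7434"));
    }
}
```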

posted @ 2006-01-17 12:09 Raymond的Java笔记

A JavaScript object that implements a queue

posted @ 2006-01-16 15:52 Raymond的Java笔记

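Solving a strange Resin compilation problem

I was developing with Resin 3.0. Strangely, after starting remote debug in Eclipse, adding a breakpoint produced a complaint that my class had no line numbers. I went through every option, and line numbers were clearly enabled. I even added a log statement at the start of a class that was certain to be executed; the code path ran, yet no log appeared in the console. I was baffled. Just before losing it, I finally remembered the temporary directories.

By default Resin always creates work and tmp directories under WEB-INF to hold the classes compiled from JSPs. I deleted both directories, and everything went back to normal; breakpoints could be set too.

Cause: Resin's check for whether a class needs recompiling appears to be flawed. When a class used by my JSP changed, the JSP that calls it was not recompiled, so the class files were not updated, which led to a string of odd errors.

Lesson for next time: when something goes wrong, delete the temporary directories first!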

posted @ 2006-01-13 16:05 Raymond的Java笔记

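The year of the Tag?

Few people meet Tags for the first time without getting excited, because they are at once familiar and unfamiliar, simple and complex. At first glance a Tag system looks like something you could knock together in no time; think again and it seems unfathomably deep. There still seems to be a great deal left to explore...

Tag-based RSS subscription is a good idea, but the first thing it brings to mind is spam. Tags suffer from information overload too. The crudest trick for raising search engine rankings has already been used by far too many people: stuffing the meta keywords with a pile of unrelated terms: second-hand, Super Girl, mobile phones, news, sexy, beauties, pictures, compensated dating, laptops... How do you keep SPAM blogs from slapping on Tags indiscriminately?

Limit the number of Tags? Useless; with a little extra effort a spammer just makes a few more copies.

Technical restrictions? In the end it comes down to cost: technical measures vary in sophistication and in expense. Unless they really don't mind spam, most users, once the initial craze has passed and they want useful information, end up back with a handful of "authorities". These "authorities" are either particular individuals or companies. The former is no longer Tag-based; as for the latter, the Tags they provide are really still a Web 1.0 SP with a Tag added on.

Of course there is no need to do 2.0 for the sake of 2.0. What we care about is getting the useful information we want more effectively while filtering out irrelevant information (especially spam) as much as possible.

The defining feature of Tags, in my view, is "communication". I stick on some silly Tag that I feel good about, and then I want to see who else is as silly as I am... This will be one of the most important reasons for Tags to exist.

The sites with the best chance of integrating Tags, and the greatest need to, are those that put a premium on interaction among users, such as BSPs and social networking circles.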

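posted @ 2006-01-12 21:48 Raymond's Java Notes | Views (284) | Comments (0)

Chinese encoding (garbled text) problem when converting a DOM Document to a String

I wrote this method for converting a Document object to a String several years ago: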

TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
DOMSource source = new DOMSource(inputDoc);
StringWriter out = new StringWriter();
StreamResult result = new StreamResult(out);
transformer.transform(source, result);
out.flush();
return out.toString();

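It had always worked without trouble, until today when I ran it on Resin and noticed something odd: the Chinese written to the database had all turned into character references of the form &#XXXX;. Odder still, the same code works normally when run as a standalone application. My guess is that Resin configures something somewhere. A quick web search turned up no good fix, and I didn't want to switch to a package like JDOM over such a small problem.

After some digging I found a solution. Add one line after the Transformer object is created:

transformer.setOutputProperty("encoding","GBK");

Problem solved. I haven't had time to look into the underlying mechanism; I'll come back to it when I can.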
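For reference, a self-contained version of the conversion with the encoding set explicitly might look like this. The class name DomToString, the sample document and the choice to pass the encoding as a parameter are my own; OutputKeys.ENCODING is the standard constant for the "encoding" property used above. One plausible explanation, which the post does not confirm, is that a serializer falls back to numeric character references for characters it believes the output encoding cannot represent.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToString {
    /** Serializes the document to a String, declaring the given output encoding. */
    public static String toXml(Document doc, String encoding) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Same property as the one-line fix above, via the standard constant.
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Builds a tiny document containing Chinese text, for demonstration only. */
    public static Document sample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("note");
        root.setTextContent("\u4e2d\u6587");
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(sample(), "UTF-8"));
    }
}
```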

posted @ 2006-01-09 17:39 Raymond's Java Notes | Views (3923) | Comments (1)

Using Struts

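My attitude toward frameworks has always been “just enough is enough”. This time I want to use Struts mainly to keep the amount of logic code in the JSPs down, though the price, of course, is a growing number of Action classes. Going all in on the Struts design would be possible too, but it raises complexity, and bringing in someone who doesn't know Struts would mean spending too much time on explanations. As a compromise I decided to use only Struts Actions, and mainly DispatchAction, so that the number of Action classes stays small.

As for Error and Message, I'll keep defining my own for now. The tag library is a good thing, but (in my opinion) it makes pages more complicated, so I'm not using it for the time being.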
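The reason DispatchAction keeps the class count down is that it picks the method to run from a request parameter, so one class can hold several related operations. The plain-Java sketch below imitates only that idea with reflection; it is not Struts code, and MiniDispatcher, UserActions and the returned page names are all made up for illustration.

```java
import java.lang.reflect.Method;

public class MiniDispatcher {
    /** Hypothetical handler: several related operations grouped in one class. */
    public static class UserActions {
        public String list() {
            return "list.jsp";
        }

        public String save() {
            return "saved.jsp";
        }
    }

    /** Looks up a public no-argument method by name and returns the page name it yields. */
    public static String dispatch(Object handler, String methodName) throws Exception {
        Method method = handler.getClass().getMethod(methodName);
        return (String) method.invoke(handler);
    }

    public static void main(String[] args) throws Exception {
        UserActions actions = new UserActions();
        // In a web application the method name would come from a request parameter.
        System.out.println(dispatch(actions, "list"));
        System.out.println(dispatch(actions, "save"));
    }
}
```

A real dispatcher would also restrict which method names may be called; this sketch leaves that out.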

posted @ 2006-01-09 13:55 Raymond's Java Notes | Views (121) | Comments (0)

Fixing Ant's javax.xml.parsers.FactoryConfigurationError

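When a java task is run with Ant under JDK 1.5, code that uses the JAXP DOM API throws javax.xml.parsers.FactoryConfigurationError.

Solution:

Put the xercesImpl.jar from Ant's lib directory on your classpath.

This is probably a problem with the classloader Ant uses; it feels as though something is hard-coded to require this particular Xerces implementation class.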
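When chasing this kind of error it can help to print which parser factory JAXP actually resolves to, both inside and outside Ant. The probe below uses only standard JAXP calls; the class name JaxpProbe is my own, and the output will differ from one environment to another.

```java
import javax.xml.parsers.DocumentBuilderFactory;

public class JaxpProbe {
    /** Returns the class name of the DocumentBuilderFactory implementation in use. */
    public static String factoryClassName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // If this system property is set, JAXP tries to load exactly that class.
        System.out.println("property: "
                + System.getProperty("javax.xml.parsers.DocumentBuilderFactory"));
        System.out.println("factory:  " + factoryClassName());
        // A null class loader here means the factory came from the JDK itself.
        System.out.println("loader:   "
                + DocumentBuilderFactory.newInstance().getClass().getClassLoader());
    }
}
```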

posted @ 2006-01-06 15:19 Raymond's Java Notes | Views (1574) | Comments (0)

User interface

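I came across an article criticizing Google Reader: the client-side scripting makes the interface fun, it said, but the features are too basic.

That may well be true, but the interface matters enormously, sometimes even more than the features. A good interface at least catches the eye at first glance and makes people want to stay a little longer.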

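posted @ 2005-12-24 22:00 Raymond's Java Notes | Views (89) | Comments (0)

[Imported] SlideShow: automatic playback for a photo album (repost)

I wrote a SlideShow prototype. It uses the image's complete property to tell whether a picture has finished loading; the picture is shown only once it has, and a LOADING image is shown until then. It also takes the following into account:
1. A loading phase is shown before each picture is fetched.
2. A picture that is still incomplete because of a slow connection or a large file is never skipped in favor of the next one; pictures are played one by one, in order.
3. The first run is a little slow because the pictures have to be fetched; a replay reads them from the cache and is much faster.

The code is as follows: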

<html>
<head>
<title>SlideShow</title>
<script language="JavaScript1.1">
<!--
var yourImages = new Array("http://blog.donews.com/images/blog_donews_com/dodo/79382/o_5540320040330075952.jpg","http://www.iqoo.com/tupian/%D0%C7%D7%F9%B1%DA%D6%BD/iqoo_1113__aries.jpg","http://blog.donews.com/images/blog_donews_com/dodo/79382/o_5540320040330081327.jpg","http://blog.donews.com/images/blog_donews_com/dodo/79382/o_5540320040330081426.jpg")

var currCount=0
var stop=false

function getimg(n){
preImages= new Image()
preImages.src = yourImages[n]
}

function autoPlay(){
if(currCount!=yourImages.length){
document.getElementById("img").style.display="none"
getimg(currCount)
document.getElementById("loadingbar").style.display="block"
setTimeout("loadingImg()",1000)
}
else{
currCount=0;
if (confirm("Playback finished. Play again?")){
return autoPlay()
}
}
}

function loadingImg(){

if (preImages.complete) {
document.getElementById("img").src="http://blog.donews.com/images/blog_donews_com/dodo/49134/o_pix.gif"
document.getElementById("loadingbar").style.display="none"

document.getElementById("img").style.display="block"

document.getElementById("img").src=yourImages[currCount]
currCount=currCount+1
}
setTimeout("autoPlay()",4000)
}

//-->
</script>

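</head>
<!-- The page body was lost in the repost. The elements below are a reconstruction based on the ids the script uses; starting playback from onload is an assumption. -->
<body onload="autoPlay()">
<div id="loadingbar" style="display:none">Loading...</div>
<img id="img" style="display:none">
</body>
</html>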

Source: http://blog.itpub.net/post/7956/49057

posted @ 2005-12-23 09:53 Raymond's Java Notes | Views (127) | Comments (0)

December 23, 2005

Volatile Fields

The Joel Test: 12 Steps to Better Code

By Joel Spolsky
Wednesday, August 09, 2000