September 26, 2005

May Day is almost here

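My thesis still isn't done. For those of us in a two-year master's program, writing a thesis is quite a hassle, and my advisor's standards are high, so it is a painful process. These days I keep wishing I could start work sooner; I have been in school too long. But classmates who have already worked all say that school is better: life is so free, and once you start working there is hardly any time to rest. It is a "fortress besieged" of its own. I have no plans for May Day; I will just keep writing my thesis.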

posted @ 2006-04-28 15:31 小树

Preparing my thesis

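The thesis proposal is due on March 27. I have been preparing for a few days and still haven't settled on a topic. I took my computer back with me, and now I spend every day in the lab; having my own machine is great.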

posted @ 2006-02-17 13:47 小树

Ottinger's Rules for Variable and Class Naming

This paper grew out of some postings made on usenet, specifically comp.object, in 1997. There was some response, and so it is presented here in its entirety, enhanced a bit.

Introduction

When a new developer joins a project which is already in progress, there is a steep learning curve. If the new developer already knows the methodology and programming language, some of this is reduced. If the new developer already knows the problem domain fairly well, this also shortens the ramp-up time.

There is often a great deal of artificial curve which is added to a project by decree or by accident. This has the opposite effect; it increases ramp-up time and can hurt the new developer's time-to-first-contribution considerably. And not only the first contribution, but the next several.

The goal of this rule set is to help avoid creating one type of artificial learning curve, that of deciphering or memorizing strange names.

The rules were developed in group discussions, largely by examining poor names and dissecting them to determine the cause of their "badness".

Use Pronounceable names

If you can't pronounce it, you can't discuss it without sounding like an idiot. "Well, over here on the bee cee arr three cee enn tee we have a pee ess zee kyew int, see?"

A company I know has genymdhms (generation date, year, month, day, hour, minute and second) so they walked around saying "gen why emm dee aich emm ess". I have an annoying habit of pronouncing everything as-written, so I started saying "gen-yah-mudda-hims". It later was being called this by a host of designers and analysts, and we still sounded silly. But we were in on the joke, so it was fun. Fun or not, don't do that.

It would have been so much better if it had been called generation_timestamp. "Hey, Mikey, take a look at this record! The generation timestamp is tomorrow! How can that be?"
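Here is a small C++ sketch of the difference. The record types and the helper function are hypothetical; only the names genymdhms and generation_timestamp come from the story above.

```cpp
#include <ctime>

// Hypothetical record using the unpronounceable field name from the story.
struct EncodedRecord {
    std::time_t genymdhms;  // you have to spell this out letter by letter
};

// The same data with a name you can say in a hallway conversation.
struct Record {
    std::time_t generation_timestamp;
};

// Hypothetical check: true when the record claims to have been
// generated later than `now`.
bool is_generated_in_future(const Record& record, std::time_t now) {
    return record.generation_timestamp > now;
}
```

The second version lets Mikey's question ("The generation timestamp is tomorrow!") map directly onto the code.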

Avoid Encodings

Encoded names require deciphering. This is true for Hungarian and other `type-encoded' or otherwise encoded variable names. To allow any encoded prefixes or suffixes in code is suspect, but to require it seems irresponsible inasmuch as it requires each new employee to learn an encoding "language" in addition to learning the (usually considerable) body of code that they'll be working in.

When you worked in name-length-challenged programs, you probably violated this rule with impunity and regret. Fortran forced it by basing type on the first letter, making the first letter a `code' for the type. Hungarian has taken this to a whole new level.

We've all seen bizarre encoded naming standards for files, producing (real name) cccoproi.sc and SRD2T3. This is an artificially-created naming standard in the modern world of long filenames, though it had its time.

This isn't intended as an attack on Hungarian notation out of malice toward Microsoft or Windows. It's a simple rule of simplifying and clarifying names. HN was pretty important back when everything was an integer handle or a long pointer, but in C++ we have (and should have) a much richer type system. We don't need HN any more. Besides, encoded names are seldom pronounceable (see Use Pronounceable names).

Of course, you can get used to anything, but why create an artificial learning curve for new hires? Avoid this if you can avoid it.
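To make the contrast concrete, here is a rough C++ sketch. Both structs, the prefixes, and the age threshold are made up for illustration; they are not taken from any real codebase.

```cpp
#include <string>

// Hungarian-style encoding: each prefix repeats what the declaration
// already says, and the reader must learn the prefix "language".
struct EncodedCustomer {
    std::string strName;
    int nAge;
};

// Unencoded names: the declaration carries the type, the name carries
// the meaning.
struct Customer {
    std::string name;
    int age;
};

// Hypothetical helper; the threshold of 18 is only an example value.
bool is_adult(const Customer& customer) {
    return customer.age >= 18;
}
```

If a field's type changes later, the unencoded names stay correct, while every encoded prefix has to be renamed or becomes a small lie.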

Don't be too cute

If the names are too clever, they will be memorable only to people who share your sense of humor and remember the joke. Will the people coming after you really remember what HolyHandGrenade is supposed to do in your program? Sure, it's cute, but maybe in this case ListItemRemover might be a better name. I've seen Monty Python and the Holy Grail, but it may take me a while to realize what you mean to do.

I've seen other similar cutesy namings fail.

Given the choice, choose clarity over entertainment value. It's a good practice.

Most meanings have multiple words. Pick ONE

Pick one word for one abstract function and stick with it. I hear that the Eiffel libraries excel at this, and I know that the C++ STL is very consistent. Sometimes the names seem a little odd (like pop_front for a list), but being consistent will reduce the overall learning curve for the whole library.

For instance, it's confusing to have fetch, retrieve and get as same-acting methods of the different classes. How do you remember which method name goes with which class? Sadly, you often have to remember who wrote the library in order to remember which term was used. Otherwise, you spend an awful lot of time browsing through headers and previous code samples. This is a considerably worse practice than the use of encodings.

Likewise, it's confusing to have a controller and a manager and a driver in the same process. What is the essential difference between a DeviceManager and a ProtocolController? Why are both not controllers, or both not managers? The names lead you to expect two objects of very different types, not merely of different classes.

We can take advantage of this to create consistent interfaces and simplify learning dramatically.
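As a sketch of what "pick one" looks like in C++, the two hypothetical classes below both use get for lookup, rather than one saying fetch and the other retrieve. The class names and members are assumptions made up for this example.

```cpp
#include <map>
#include <string>

// Hypothetical table of prices, keyed by item name.
class PriceTable {
public:
    void add(const std::string& item, double price) { prices_[item] = price; }
    // Lookup is always called get(); throws std::out_of_range if missing.
    double get(const std::string& item) const { return prices_.at(item); }
private:
    std::map<std::string, double> prices_;
};

// Hypothetical table of stock counts. Same verbs, same meaning.
class StockTable {
public:
    void add(const std::string& item, int count) { counts_[item] = count; }
    int get(const std::string& item) const { return counts_.at(item); }
private:
    std::map<std::string, int> counts_;
};
```

Once you have learned one of these classes, you can guess the other without opening the header.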

Most words have multiple meanings

Don't use the same word for two purposes, if you can at all avoid it.

This is the inverse of the previous rule. When you use different terms, it leads one to think that there are different types underlying them. If I use DeviceManager and ProtocolManager, it leads one to expect the two to have very similar interfaces. If I can call DeviceManager::add(), I should be able to call ProtocolManager::add(). Why? Because the name created an association between the two. I expect to see *Manager::add() now.
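A minimal C++ sketch of that expectation follows. The member functions, the count() accessor, and the add_all helper are hypothetical; the text above only names DeviceManager, ProtocolManager, and add().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical manager of device names.
class DeviceManager {
public:
    void add(const std::string& device) { devices_.push_back(device); }
    std::size_t count() const { return devices_.size(); }
private:
    std::vector<std::string> devices_;
};

// Hypothetical manager of protocol names, with the same interface
// that the shared word "Manager" leads readers to expect.
class ProtocolManager {
public:
    void add(const std::string& protocol) { protocols_.push_back(protocol); }
    std::size_t count() const { return protocols_.size(); }
private:
    std::vector<std::string> protocols_;
};

// Works with any *Manager that offers add(), which is exactly the
// association the common name creates.
template <typename Manager>
void add_all(Manager& manager, const std::vector<std::string>& names) {
    for (const std::string& name : names) {
        manager.add(name);
    }
}
```

If ProtocolManager had called the same operation register_protocol(), add_all would not compile for it, and a human reader would stumble in the same place.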

If you use the same word, but you have very different interfaces, this isn't a total evil (see Names are only Meaningful in Context), but it does cause some confusion. If your system or your module is small enough, or your controls rigorous enough to prevent synonyms, then that's great.

If you're learning a framework, though, you need to be most careful not to be fooled by synonyms. While you should be able to count on the names denoting type, you frequently cannot.

Remember also that it's not polite at all to have the same name in two scopes.

Nouns and Verb Phrases

Classes and objects should have noun or noun phrase names.

There are some methods (commonly called "accessors") which calculate and/or return a value. These can and probably should have noun names. This way accessing a person's first name can read like:

        string x = person.name();

Other methods (sometimes called "manipulators", but not so commonly anymore) cause something to happen. These should have verb or verb-phrase names. This way, changing a name would read like:

        fred.changeNameTo("mike");

As a class designer, does this sound boringly unimportant? If so, then go write code that uses your classes. The best way to test an interface is to use it and look for ugly, contrived, or confusing text. This really helps.
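For a fuller picture, here is one way the class behind those two calls might look in C++. This Person class is only a sketch; its constructor and private member are assumptions.

```cpp
#include <string>
#include <utility>

// Hypothetical Person class following the noun/verb convention.
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Accessor: a noun name, returns a value and changes nothing.
    const std::string& name() const { return name_; }

    // Manipulator: a verb phrase, causes something to happen.
    void changeNameTo(const std::string& newName) { name_ = newName; }

private:
    std::string name_;
};
```

Reading the calling code aloud is the test: "person name" sounds like a value, "fred, change name to mike" sounds like an action.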

Use Solution Domain Names

Go ahead, use computer science (CS) terms, algorithm names, pattern names, math terms, etc.

Yeah, it's a bit heretical, but you don't want your developers having to run back and forth to the customer asking what every name means if they already know the concept by a different name.

We're talking about code here, so you're more likely to have your code maintained by a CS major or informed programmer than by a domain expert with no programming background. End users of a system very seldom read the code, but the maintainers have to.

Also Use Problem Domain Names

When there is no `programmer-ese' for what you're doing, use the name from the problem domain. At least the programmer who maintains your code can ask his boss what it means.

In analysis, of course, this is the superior rule to [Use Solution Domain Names], because the end-user is the target audience.

Avoid Mental Mapping

Readers shouldn't have to mentally translate your names into other names they already know.

There are some unfortunate examples of this. One of them is Microsoft's choice to call the things that walk through a list Enumerators instead of Iterators. This is sad because the term iterator is in common use in software circles and was completely appropriate to the domain (see Pick One) and also because the term enumeration typically has a very different meaning (see Multiple Meanings). Between the two, most developers have to translate enumerator to iterator mentally as the conversations about such things go on.

This problem generally arises from a choice to use neither problem domain terms nor solution domain terms.

Nothing is intuitive

Sadly, and in contradiction to the above, all names require some mental mapping, since this is the nature of language. If you use a term which might not be known to your audience, you must map it to the concept you'd like it to represent.

For this reason, most important names should be in a glossary or should be explained in comments at least. Even if they're parameters or local variables. Even if they're inside the static member of a class, unless the term is completely in harmony with all of these naming rules.

Avoid Disinformation

Avoid words which already mean something else. For example, "hp", "aix", and "sco" would be horrible variable names because they are the names of Unix platforms or variants. Even if you are coding a hypotenuse and "hp" looks like a good abbreviation, it violates too many rules and also is disinformative.

Likewise don't refer to a grouping of accounts as an AccountList unless it's actually a list. A list means something to CS people. It denotes a certain type of data structure. If the container isn't a list, you've disinformed the programmer who has to maintain your code. AccountGroup or BunchOfAccounts would have been better.
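A short C++ sketch of the point: the hypothetical container below is backed by a std::set, so calling it AccountList would promise ordering and duplicates that it does not deliver. The member names are assumptions for illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical grouping of account identifiers. The name says "group",
// which makes no claim about the underlying data structure.
class AccountGroup {
public:
    void add(const std::string& accountId) { ids_.insert(accountId); }
    bool contains(const std::string& accountId) const {
        return ids_.count(accountId) > 0;
    }
    std::size_t size() const { return ids_.size(); }
private:
    std::set<std::string> ids_;  // a set: duplicates are silently merged
};
```

A maintainer who adds the same account twice and sees size() return 1 is not surprised by a group; with a "list" they would be.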

The absolute worst example of this would be the use of lower-case L or uppercase O as variable names, especially in combination. The problem, of course, is in code where such things as this occur:

    int a = l;
    if ( O = l )
        a = O1;
    else
        l = 0;

You think that I made this one up, right? Sorry. I've examined code this year (1997) where such things were abundant. It's a great technique for shrouding your code.

When I complained, one author told me that I should use a different font so that the differences were more obvious. I think that the problem could be more easily and finally corrected by search-and-replace than by publishing a requirement that all future readers choose Font X.

Names are only Meaningful in Context

There are a few names which are meaningful in and of themselves. Most, however, are not. Instead, you need to place names in context for your reader by enclosing them in classes, well-named functions, or comments.

The term `tree' needs some disambiguation, for example if the application is a forestry application. You may have syntax trees, red-black or b-trees, and also elms, oaks, and pines. The word `tree' is a good word, and is not to be avoided, but it must be placed in context every place it is used.

If you review a program or enter into a conversation where the word "tree" could mean either, and you aren't sure, then the author (speaker) will have to clarify.
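In C++, one way to supply that context is a namespace. The sketch below is hypothetical: the namespace names, the fields, and the is_oak helper are all invented for the forestry example.

```cpp
#include <string>

// Context for the botanical meaning of "tree".
namespace forestry {
struct Tree {
    std::string species;  // for example "elm", "oak", or "pine"
    double heightMeters;
};
}  // namespace forestry

// Context for the computer-science meaning of "tree".
namespace parsing {
struct Tree {
    std::string rootSymbol;  // grammar symbol at the root of a syntax tree
    int nodeCount;
};
}  // namespace parsing

// The qualified name tells the reader which kind of tree is meant.
bool is_oak(const forestry::Tree& tree) {
    return tree.species == "oak";
}
```

The word Tree stays in both places; the enclosing scope does the disambiguation.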

Don't add Artificial Context

In an imaginary application called "Gas Station Deluxe", it is a bad idea to prefix every class with `GSD' if there is a chance that the class might later be used in "Inventory Manager" (at which time the prefix becomes meaningless).

Likewise, say you invented a `Mailing Address' class in GSD's accounting module, and you named it AccountAddress. Later, you need a mailing address for your customers. Do you use `AccountAddress'?

In both these cases, the naming reveals an earlier short-sightedness regarding reuse. It shows that there was a failing at the design level to look for common classes across an application.

Sadly, this is the standard being used by many Java authors. Even in C++, this is becoming increasingly common. We need language support for this type of work. I've not had too much trouble with it in Python, but I'm watching out. You should also.

The names `accountAddress' and `customerAddress' are fine names for instances of the class.
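Sketched in C++, that might look like the following. The MailingAddress fields, the Invoice struct, and the comparison helper are assumptions made up for this example.

```cpp
#include <string>

// One reusable class, named for what it is rather than for the module
// that happened to need it first.
struct MailingAddress {
    std::string street;
    std::string city;
    std::string postalCode;
};

// Hypothetical owner: the context lives in the instance names.
struct Invoice {
    MailingAddress accountAddress;
    MailingAddress customerAddress;
};

// Field-by-field comparison of two addresses.
bool same_address(const MailingAddress& a, const MailingAddress& b) {
    return a.street == b.street && a.city == b.city &&
           a.postalCode == b.postalCode;
}
```

Nothing in MailingAddress needs to change when a third kind of address shows up.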

No Disambiguation without Differentiation

This is a problem that usually arises from writing code solely for the compiler/interpreter. You can't have the same name referring to two things in the same scope, so you change one of them. Well, that's better than misspelling one (I've seen code that looks like this was intentional, and correcting the spelling prevented compiles due to symbol clashes), but there should be some fundamental change in name that makes it clear that they are different.

Imagine that you have a Product class. If you have another called ProductInfo or ProductData, you have failed to make the names different. Info and Data are like "stuff": basically meaningless. Likewise, using the words Class or Object in an OO system is so much noise; can you imagine having CustomerObject and Customer as two different class names?

MoneyAmount is no better than `money'. CustomerInfo is no better than Customer. The word `variable' should never appear in a variable name. The word `table' should never appear in a table name. How is NameString better than Name? Would a Name ever be a floating point number? Probably not. If so, it breaks an earlier rule about disinformation.

There is an application I know of where this is illustrated. I've changed the name of the thing we're getting to protect the guilty, but the exact form of the error is:

         getSomething();
         getSomethings();
         getSomethingInfo();

The second tells you there are many of these things. The first lets you know you'll get one, but which? The third tells you nothing more than the first, but the compiler (and hopefully the author) can tell them apart. You are going to have to work harder.

Try to disambiguate in such a way that the reader knows what the different versions offer her, instead of merely that they're different.
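As an illustration only, here is a C++ sketch in which "Something" is replaced by a hypothetical Order and each name says what the caller gets back. The OrderStore class and all of its members are invented for this example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical store of order descriptions, keyed by order id.
class OrderStore {
public:
    void add(int id, const std::string& description) {
        orders_[id] = description;
    }

    // One order, and the name says which one (instead of getOrder()).
    const std::string& orderById(int id) const { return orders_.at(id); }

    // Every id in the store, in ascending order (instead of getOrders()).
    std::vector<int> allOrderIds() const {
        std::vector<int> ids;
        for (const auto& entry : orders_) {
            ids.push_back(entry.first);
        }
        return ids;
    }

    // One specific fact about the store (instead of getOrderInfo()).
    std::size_t orderCount() const { return orders_.size(); }

private:
    std::map<int, std::string> orders_;
};
```

Each name now differs from the others in what it offers, not just in spelling.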

Final Words ...

The hardest thing about choosing good names is that it requires good descriptive skills and a shared cultural background. This is a teaching issue, rather than a technical, business, or management issue. As a result many people in this field don't do it very well.

Follow some of these rules, and see if you don't improve the readability of your code. If you are maintaining someone else's code, make changes to resolve these problems. It will pay off in the long run.

posted @ 2006-02-17 13:43 小树

Days of waiting

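Once again it has been a long time since I was last here; it looks like keeping this up will be hard. My job is settled: I signed with ZTE (中兴通讯). A while ago I received an offer from Nortel in Guangdong (广东北电), but unfortunately it came too late and they pressed for an answer too quickly. This job hunt ended early, and overall I am fairly satisfied. Ever since the Nortel round, I passed the written test and reached the interview almost every time, and I received several offers. Choosing ZTE also involved a good deal of coincidence; had things gone just slightly differently, who knows which company I would have signed with, haha. I still don't know what I will be doing at the company. Positions have not been assigned yet; I applied for R&D, and that probably won't change. I am waiting for the assignment and hope to start my internship soon.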

posted @ 2005-12-21 00:47 小树

Reviewing, reviewing

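The semester has started, and not many companies have come to recruit yet; most will probably wait until after October. I am reviewing Thinking in Java (Java编程思想) and data structures, among other things, to prepare for the written tests and interviews in October.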

posted @ 2005-09-26 18:50 小树
