Decode360's Blog

业精于勤而荒于嬉 QQ:150355677 MSN:decode360@hotmail.com

  BlogJava :: 首页 :: 新随笔 :: 联系 ::  :: 管理 ::
  302 随笔 :: 26 文章 :: 82 评论 :: 0 Trackbacks
    这是[SQL.Puzzles]的第二篇,都是一些简单的SQL处理,但是整体上来看,这本书还是有点深度的,不像别的一些SQL入门级读物,举的例子都很弱智,起码还考虑到了很多实际工作上会遇到的问题。不过这篇里还是比较简单的,主要是要关注外键删除和工作日的考虑,这个问题非常实际。[SQL.Puzzles]再转这么一篇,以后就等有看着顺眼的SQL再转了,太简单的就直接忽视。
 
 
. ABSENTEES
 
    This problem was presented on the MS ACCESS forum on CompuServeby Jim Chupella. He wanted to create a database that tracks employeeabsentee rates. Here is the table you will use:
    CREATE TABLE Absenteeism
    (emp_id INTEGER NOT NULL REFERENCES Personnel (emp_id),
     absent_date DATE NOT NULL,
     reason_code CHAR (40) NOT NULL REFERENCES ExcuseList(reason_code),
     severity_points INTEGER NOT NULL CHECK (severity_pointsBETWEEN 1 AND 4),
     PRIMARY KEY (emp_id, absent_date));

 
    An employee ID number identifies each employee. The reason_codeis a short text explanation for the absence (for example, “hit by beertruck,” “bad hair day,” and so on) that you pull from an ever-growingand imaginative list, and severity point is a point system that scores thepenalty associated with the absence.
 
    If an employee accrues 40 severity points within a one-year period,you automatically discharge that employee. If an employee is absentmore than one day in a row, it is charged as a long-term illness, not as atypical absence. The employee does not receive severity points on thesecond, third, or later days, nor do those days count toward his or hertotal absenteeism.
 
    Your job is to write SQL to enforce these two business rules, changingthe schema if necessary.
 
 
Answer #1
 
    Looking at the first rule on discharging personnel, the most commondesign error is to try to drop the second, third, and later days from the
table. This approach messes up queries that count sick days, and makeschains of sick days very difficult to find.
    The trick is to allow a severity score of zero, so you can track the longtermillness of an employee in the Absenteeism table. Simply change the
severity point declaration to “CHECK (severity_points BETWEEN 0AND 4)” so that you can give a zero to those absences that do not count.

    This is a trick newbies miss because storing a zero seems to be a waste ofspace, but zero is a number and the event is a fact that needs to be noted.
    UPDATE Absenteeism
       SET severity_points= 0,
           reason_code = 'long term illness'
     WHERE EXISTS
           (SELECT *
              FROM Absenteeism AS A2
             WHERE Absenteeism.emp_id = A2.emp_id
               AND Absenteeism.absent_date = (A2.absent_date -INTERVAL 1 DAY));

 
    When a new row is inserted, this update will look for another absenceon the day before and change its severity point score and reason_code in
accordance with your first rule.
 
    The second rule for firing an employee requires that you know whathis or her current point score is. You would write that query as follows:
    SELECT emp_id, SUM(severity_points)
      FROM Absenteeism
     GROUP BY emp_id;

 
    This is the basis for a grouped subquery in the DELETE statementyou finally want. Personnel with less than 40 points will return a NULL,and the test will fail.
    DELETE FROM Personnel
     WHERE emp_id = (SELECT A1.emp_id
                       FROM Absenteeism AS A1
                      WHERE A1.emp_id = Personnel.emp_id
                      GROUP BY A1.emp_id
                     HAVING SUM(severity_points) >= 40);

 
    The GROUP BY clause is not really needed in SQL-92, but some olderSQL implementations will require it.
 
 
Answer #2

    Bert Scalzo, a senior instructor for Oracle Corporation, pointed out thatthe puzzle solution had two flaws and room for performance
improvements.The flaws are quite simple. First, the subquery does not check forpersonnel accruing 40 or more severity points within a one-year period,
as required. It requires the addition of a date range check in the WHERE
    clause:
    DELETE FROM Personnel
     WHERE emp_id = (SELECT A1.emp_id
                       FROM Absenteeism AS A1
                      WHERE A1.emp_id = Personnel.emp_id
                        AND absent_date
                            BETWEEN CURRENT_TIMESTAMP - INTERVAL 365 DAYS
                        AND CURRENT_TIMESTAMP
                      GROUP BY A1.emp_id
                     HAVING SUM(severity_points) >= 40);

 
    Second, this SQL code deletes only offending personnel and not theirabsences. The related Absenteeism row must be either explicitly orimplicitly deleted as well. You could replicate the above deletion for theAbsenteeism table. However, the best solution is to add a cascadingdeletion clause to the Absenteeism table declaration:
 
    CREATE TABLE Absenteeism
    ( ... emp_id INTEGER NOT NULL
      REFERENCES Personnel(emp_id)
     ON DELETE CASCADE,
     ...);

 
    The performance suggestions are based on some assumptions. If youcan safely assume that the UPDATE is run regularly and people do notchange their departments while they are absent, then you can improvethe UPDATE command’s subquery:
    UPDATE Absenteeism AS A1
       SET severity_points = 0,
           reason_code = 'long term illness'
     WHERE EXISTS
           (SELECT *
              FROM absenteeism as A2
             WHERE A1.emp_id = A2.emp_id
               AND (A1.absent_date + INTERVAL 1 DAY) =A2.absent_date);

    There is still a problem with long-term illnesses that span weeks. Thecurrent situation is that if you want to spend your weekends being sick,that is fine with the company. This is not a very nice place to work. If anemployee reports in absent on Friday of week number 1, all of weeknumber 2, and just Monday of week number 3, the UPDATE will catchonly the five days from week number 2 as long-term illness. The Fridayand Monday will show up as sick days with severity points. The subqueryin the UPDATE requires additional changes to the missed-date chaining.
 
    I would avoid problems with weekends by having a code forscheduled days off (weekends, holidays, vacation, and so forth) thatcarry a severity point of zero. A business that has people workingweekend shifts would need such codes.
 
    The boss could manually change the Saturday and Sunday “weekend”codes to “long-term illness” to get the UPDATE to work the way youdescribed. This same trick would also prevent you from losing scheduledvacation time if you got the plague just before going on a cruise. If theboss is a real sweetheart, he or she could also add compensation days forthe lost weekends with a zero severity point to the table, or reschedule anemployee’s vacation by adding absences dated in the future.
 
    While I agreed that I left out the aging on the dates missed, I willargue that it would be better to have another DELETE statement thatremoves the year-old rows from the Absenteeism table, to keep the sizeof the table as small as possible.
 
    The expression
    (BETWEEN CURRENT_TIMESTAMP - INTERVAL 365 DAYS AND
     CURRENT_TIMESTAMP)

 
    could also be
    (BETWEEN CURRENT_TIMESTAMP - INTERVAL 1 YEAR AND
     CURRENT_TIMESTAMP),

 
    so the system would handle leap years. Better yet, DB2 and some otherSQL products have an AGE(date1) function, which returns the age inyears of something that happened on the date parameter. You wouldthen write (AGE(absent_date) >= 1) instead.
 
 
Answer #3

    Another useful tool for this kind of problem is a Calendar table, whichhas the working days that can count against the employee. In the 10years since this book was first written, this has become a customary SQLprogramming practice.
    SELECT A.emp_id,
           SUM(A.severity_points) AS absentism_score
      FROM Absenteeism AS A, Calendar AS C
     WHERE C1.cal_date = A.absent_date
       AND A.absent_date
           BETWEEN CURRENT_TIMESTAMP - INTERVAL 365 DAYS
           AND CURRENT_TIMESTAMP
       AND C1.date_type = ‘work’
     GROUP BY emp_id
    HAVING SUM(A.severity_points)>= 40;

 
    Some people will also have a column in the Calendar table thatJulianizes the working days. Holidays and weekends would carry thesame Julian number as the preceding workday. For example
    (cal_date,Julian_workday) :
    ('2006-04-21', 42) – Friday
    ('2006-04-22', 42) – Saturday
    ('2006-04-23', 42) – Sunday
    ('2006-04-24', 43) – Monday
 
    You do the math from the current date’s Julian workday number tofind the start of their adjusted one-year period.
 
 
 
 




-The End-

posted on 2009-03-21 19:51 decode360-3 阅读(233) 评论(0)  编辑  收藏 所属分类: Toy

只有注册用户登录后才能发表评论。


网站导航: