java.sql.BatchUpdateException: IO Error: Connection reset

While analyzing "IO Error: Connection reset", I found many articles saying that it can be caused by the Java security code (reading /dev/random) used when a JDBC connection is opened. However, that is not the root cause in my case: in my environment, Java already uses /dev/urandom.

1. $JAVA_HOME/jre/lib/security/java.security contains:

    securerandom.source=file:/dev/./urandom

2. Checked with strace. Only -Djava.security.egd=file:/dev/../dev/urandom triggers a system call (a read on /dev/urandom); the other path formats below are OK.

    -Djava.security.egd=file:/dev/./urandom
    -Djava.security.egd=file:///dev/urandom

3. I kept checking the entropy pool size and have never seen it exhausted.

    while [ 1 ]; do
      cat /proc/sys/kernel/random/entropy_avail
      sleep 1
    done

Usually the available entropy is in the range from 1000 to 3000.

So far, there is no clue about the root cause of "IO Error: Connection reset".
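To double-check the two settings from inside the JVM rather than from the files, a small sketch like the one below can print what the running process actually sees. The class name `EntropySourceCheck` is made up for illustration; `Security.getProperty` and `System.getProperty` are the standard calls, and either value may print as null when it is not set.

```java
import java.security.Security;

class EntropySourceCheck {
    /** Returns the two settings that influence where SecureRandom takes its seed from. */
    static String describe() {
        // Value from $JAVA_HOME/jre/lib/security/java.security (may be null).
        String fromSecurityFile = Security.getProperty("securerandom.source");
        // Value passed with -Djava.security.egd=... (null when the flag is absent).
        String fromCommandLine = System.getProperty("java.security.egd");
        return "securerandom.source=" + fromSecurityFile
                + ", java.security.egd=" + fromCommandLine;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same -D flags as the application shows whether the flag really reached the JVM.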

Lessons learned - Oracle GI and Database Installation on SUSE 12

I encountered many issues during the installation of Oracle Grid Infrastructure (GI) and Database. With the help of articles and documents found through Google, I finally made it. For the record, here are the issues I encountered and the solutions I applied. The major issues came up during the GI installation.

Pre-installation tasks

Issue 1: swap space is not big enough (1.3.1 Verify System Requirements).

    grep MemTotal /proc/meminfo
    264G
    grep SwapTotal /proc/meminfo
    2G

During OS installation I took the default option, so the swap space was only 2G. Oracle recommends more than 16G of swap space when there is more than 32G of RAM.

    dd if=/dev/zero of=/home/swapfile bs=1024 count=33554432
    33554432+0 records in
    33554432+0 records out
    34359738368 bytes (34 GB) copied
    mkswap /home/swapfile
    chmod 0600 /home/swapfile
    swapon /home/swapfile

Lesson learned: set up swap space according to the DB requirements when installing the OS.

Issue 2: cannot find oracleasm-kmp-default on the Oracle site (1.3.6 Prepare Storage for Oracle Automatic Storage Management).

Installing oracleasmlib and oracleasm-support is easy: just download them from Oracle and install them. Originally the oracleasm kernel module was provided by Oracle as well, but I could not find it there any more. Finally I realized that the oracleasm kernel module is now provided by the OS vendor; in my case, it has to be installed from the SUSE disk.

a. Get its name, oracleasm-kmp-default:

    zypper se oracle

b. Map the DVD and install:

    zypper in oracleasm-kmp-default
    rpm -qa|grep oracleasm
    oracleasm-kmp-default-2.0.8_k3.12.49_11-3.20.x86_64
    oracleasm-support-2.1.8-1.SLE12.x86_64
    oracleasmlib-2.0.12-1.SLE12.x86_64

Then configure ASM and create the disk (the last two commands both list DATA):

    oracleasm configure -i
    oracleasm createdisk DATA /dev/<...>
    oracleasm listdisks
    ls /dev/oracleasm/disks

Installation tasks

Issue 3: the installer (OUI), started as user oracle, always failed at the user equivalence check. However, when I checked manually with runcluvfy, no issue was found at all.

    ./runcluvfy.sh stage -pre crsinst -n , -verbose

I worked around it by using another user instead of user oracle, but that triggered the next issue.

Issue 4: cannot see the ASM disks in OUI. No matter how I changed the disk discovery path, the disk list was empty, although I could find the disk manually.

    /usr/sbin/oracleasm-discover 'ORCL:*'
    Discovered disk: ORCL:DATA

The root cause is that ASM was configured and the disk created as user oracle, while I was installing GI as a different user, so I could not see the disk that had been created. Changing the owner of the disk device file solved the issue.

    ls /dev/oracleasm/disks
    chown /dev/oracleasm/disks -R

Issue 5: root.sh execution failed. It failed three times, for three different reasons.

First failure:

    Failed to create keys in the OLR, rc = 127, Message: clscfg.bin: error while loading shared libraries: libcap.so.1: cannot open shared object file: No such file or directory

Fixed with the command below:

    zypper in libcap1

Second failure:

    ohasd failed to start
    Failed to start the Clusterware. Last 20 lines of the alert log follow:
    2016-07-24 23:10:28.502: [client(1119)]CRS-2101:The OLR was formatted using version 3.

I found a good document from SUSE, "Oracle RAC on SUSE Linux Enterprise Server 12 - x86_64". It makes clear that SUSE 12 is supported by Oracle GI, and it also mentions patch 18370031: "During the Oracle Grid Infrastructure installation, you must apply patch 18370031 before configuring the software that is installed." Patch 18370031 is actually mentioned in "Oracle quick installation guide on Linux", but not in the other Oracle installation guide I was using. I mainly followed the latter and missed the patch. The issue disappeared after I installed patch 18370031.

    ./OPatch/opatch napply -oh -local /18370031

Third failure:

    Errors in file :
    ORA-27091: unable to queue I/O
    ORA-15081: failed to submit an I/O operation to a disk
    ORA-06512: at line 4

Solved by changing the owner of the files related to disk DATA:

    ls -l /dev/oracleasm/iid

and then running chown on the folder /dev/oracleasm/iid and some hidden .* files.

Issues during DB installation

Issue 6: the installer reports an error in invoking target 'agent nmhs'.

    vi $ORACLE_HOME/sysman/lib/ins_emagent.mk

Search for the line

    $(MK_EMAGENT_NMECTL)

and change it to:

    $(MK_EMAGENT_NMECTL) -lnnz11

Refer to https://community.oracle.com/thread/1093616?tstart=0

Meta Information

Just use this blog to share some meta information.


Notes on Gentoo Installation

After two weeks of struggle, I have successfully installed Gentoo, a popular GNU/Linux distribution. For the record, the obstacles I encountered are listed below (although I cannot remember every solution exactly).

0. Failed to emerge gpm when installing the links package. If I recall correctly, it was resolved by installing gpm manually.

1. Installing glib 2.22.5 failed because update-desktop-database was missing. It is in dev-util/desktop-file-utils, but when I tried to emerge that, there was a circular dependency on glib. I found no clean solution and have forgotten how I resolved the problem.

2. Later, after glib was installed, I could install gpm-1.20.6 with the ~amd64 keyword, but it conflicted with the manually installed gpm. I removed the conflicting files and the emerge succeeded.

3. Failed to emerge tiff. I edited /etc/portage/package.keywords to add the following: / ~amd64. That lets me use the latest beta version of tiff, which is unstable and masked.

4. Later, atk-1.28.0 failed to emerge. I edited /etc/make.conf to add FEATURES="-stricter", after which it emerged successfully with only some complaints. Without this setting, the warnings from GCC cause the emerge to fail.

5. When I ran emerge --update system, gcc was to be upgraded from 4.3.4 to 4.4.3, but it failed because of compilation warnings again. Adding "-stricter" to the FEATURES variable in /etc/make.conf works around it.

6. The installation takes a long time; KDE alone took more than 10 hours. There is still a lot of room for improvement! Anyway, it is nice to be able to use it daily.

Add the following to the file C:\Documents and Settings\<user_name>\Application Data\Subversion\servers:

http-proxy-host = ***.**.com
http-proxy-port = 8080



call ttGridCreate('$TT_GRID');
call ttGridCreate(' $TT_GRID');

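1. Access the object's field directly.
2. Access the object's field through a method.
3. Store and access the value with a Map.
4. Reflection: access through Field.
5. Reflection: access through Method.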

iterations  access style              time
100         field access              14,806
100         method access             20,393
100         map access                66,489
100         reflection field access   620,190
100         reflection method access  1,832,356
100000      field access              2,938,362
100000      method access             3,039,772
100000      map access                10,784,052
100000      reflection field access   144,489,034
100000      reflection method access  37,525,719
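1. The performance of getters/setters is already close to that of direct field access (about 50% slower), so there is no need to switch to direct field access out of concern for getter/setter performance.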
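The original benchmark program is not shown, so here is a minimal sketch of how the five access styles could be compared. Everything in it (the class `AccessStyles`, the `Point` type, the value 42, the nanosecond unit) is an assumption made for illustration. It is a crude measurement without JIT warm-up, and each read goes through an `IntSupplier`, which adds its own small overhead, so its numbers should not be expected to match the table above.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntSupplier;

class AccessStyles {
    static class Point {
        public int x = 42;
        public int getX() { return x; }
    }

    /** Builds one reader per access style, in the order of the list above. */
    static Map<String, IntSupplier> readers(Point p) {
        try {
            Field field = Point.class.getField("x");
            Method getter = Point.class.getMethod("getX");
            Map<String, Integer> map = new HashMap<>();
            map.put("x", p.x);

            Map<String, IntSupplier> r = new LinkedHashMap<>();
            r.put("field access", () -> p.x);
            r.put("method access", p::getX);
            r.put("map access", () -> map.get("x"));
            r.put("reflection field access", () -> {
                try { return field.getInt(p); }
                catch (IllegalAccessException e) { throw new IllegalStateException(e); }
            });
            r.put("reflection method access", () -> {
                try { return (Integer) getter.invoke(p); }
                catch (ReflectiveOperationException e) { throw new IllegalStateException(e); }
            });
            return r;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Times the given number of reads in nanoseconds and checks the values that were read. */
    static long timeNanos(IntSupplier reader, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += reader.getAsInt();
        long elapsed = System.nanoTime() - start;
        if (sink != 42L * iterations) throw new IllegalStateException("unexpected value read");
        return elapsed;
    }

    public static void main(String[] args) {
        Point p = new Point();
        for (int n : new int[] {100, 100_000})
            for (Map.Entry<String, IntSupplier> e : readers(p).entrySet())
                System.out.printf("%d %s, %,d ns%n", n, e.getKey(), timeNanos(e.getValue(), n));
    }
}
```

For numbers that can be trusted, a harness with warm-up and multiple runs would be the better tool; this sketch only shows the shape of the comparison.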


Tomcat Source Code Reading

0. I am reading the source code of Tomcat 6.0.26. To make the effort pay off,
I am documenting some notes for the record. Thanks to the authors of the articles about the Tomcat
source code, and especially of the book <<How Tomcat Works>>.

1. There are two concepts of "server". One is called Server, which
manages Tomcat itself (start and stop); the other is called Connector,
which is the server that serves application requests. They listen on different
ports, and server.xml shows the difference clearly.

<Server port="8005" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1"
               redirectPort="8443" />
    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
    <!-- Engine, Host and the rest of the file omitted -->
  </Service>
</Server>

Although Server is the top-level element, logically it should not be.
Actually, in the code, Bootstrap starts the service first, which
in turn starts the Server and the server's services.

2. My focus is on the Connector part: I care about how a request is serviced by
Tomcat. Here are some key classes.

Connector
    --> ProtocolHandler (Http11Protocol and AjpProtocol)
        --> JIoEndpoint
            --> Handler (Http11ConnectionHandler and AjpConnectionHandler)

3. Connector is the most obvious class, but the entry point is not there.
The call sequence is like this.

--> JIoEndpoint.processSocket(Socket socket)
        --> Http11ConnectionHandler.process(Socket socket)
            --> Http11Processor.process(Socket socket)
                --> CoyoteAdapter.service(Request req, Response res)

The core logic is in the method Http11Processor.process(Socket socket).

CoyoteAdapter.service(Request req, Response res) bridges between Connector module and Container module.
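To make the shape of that call chain concrete, here is a toy sketch of the same layering. None of these classes are Tomcat's: `Endpoint`, `Handler`, `Processor`, `Adapter` and `Container` are simplified stand-ins invented for illustration, strings replace sockets and request objects, and the real classes do far more (threading, keep-alive, header parsing and so on).

```java
class RequestChainSketch {
    /** Stand-in for the container side (Engine/Host/Context/Servlet in real Tomcat). */
    interface Container { String invoke(String uri); }

    /** Stand-in for CoyoteAdapter: bridges the connector module and the container module. */
    static class Adapter {
        private final Container container;
        Adapter(Container container) { this.container = container; }
        String service(String requestLine) {
            String[] parts = requestLine.split(" ");
            return container.invoke(parts.length > 1 ? parts[1] : "/");
        }
    }

    /** Stand-in for Http11Processor: turns the raw input into a request and a response. */
    static class Processor {
        private final Adapter adapter;
        Processor(Adapter adapter) { this.adapter = adapter; }
        String process(String rawInput) {
            String requestLine = rawInput.split("\r\n", 2)[0];
            return "HTTP/1.1 200 OK\r\n\r\n" + adapter.service(requestLine);
        }
    }

    /** Stand-in for the connection handler: hands the connection to a processor. */
    static class Handler {
        private final Processor processor;
        Handler(Processor processor) { this.processor = processor; }
        String process(String rawInput) { return processor.process(rawInput); }
    }

    /** Stand-in for the endpoint: the entry point that receives the connection. */
    static class Endpoint {
        private final Handler handler;
        Endpoint(Handler handler) { this.handler = handler; }
        String processSocket(String rawInput) { return handler.process(rawInput); }
    }

    /** Wires the four layers together around the given container. */
    static Endpoint wire(Container container) {
        return new Endpoint(new Handler(new Processor(new Adapter(container))));
    }

    public static void main(String[] args) {
        Endpoint endpoint = wire(uri -> "served " + uri);
        System.out.println(endpoint.processSocket("GET /hello HTTP/1.1\r\nHost: example\r\n\r\n"));
    }
}
```

The point of the sketch is only the direction of the calls: the endpoint is the entry point, and the adapter is the single place where the connector side touches the container side.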

Any comments are welcome. I may continue reading the source code and dig deeper into it if time permits.

Navigate forth and back with ctrl+] and ctrl+t in Cscope

It is handy to be able to navigate the source code with Ctrl+] in Cscope, but I always forget how to navigate back and have wasted effort on it many times. So, for the record: Ctrl+t navigates back in Cscope.

One more time, Ctrl+] and Ctrl+t can navigate forth and back in Cscope.

Learning Notes of TCP/IP Illustrated Volume 2

How to read the source code in <<TCP/IP Illustrated Volume 2>>

1. Get the source code. The original link provided in the book is no longer available,
so you may need to google it.

2. Install cscope and vi.

3. Refer to http://cscope.sourceforge.net/large_projects.html for the following steps.

This would include the source code of the whole OS, not only the kernel:
find src -name '*.[ch]' > cscope.files

We actually only care about the kernel source:
find src/sys -name '*.[ch]' > cscope.files

4. Check the size of the file list.
wc cscope.files
 1613  1613 45585 cscope.files

5. In vim, run
:help cscope
to read the help details.

6. If you run vim in the folder where cscope.out resides, it will be loaded.

7. Try a few commands.
:cs find g mbuf
:cs find f vm.h

They work. A good start.

P.S. This book is quite old. If you know the subject well and can recommend a better alternative for learning TCP/IP, please post a comment. Thanks in advance.

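I assumed that when two primes are multiplied, there is a high chance that a prime exists near the product. For example, 7 × 11 = 77, and the nearby 79 happens to be prime.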




I wrote a program to check it. Among 16-bit integers, only about 10% make the assumption hold.
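The original program is not shown, so the following is only a sketch of one way such a check could be written. It rests on assumptions that are mine, not the post's: "near" is taken to mean within a distance of 2 (as in 77 and 79), the products are p × q for primes p <= q, and "16-bit" means products below 65536. With a different definition the percentage will differ, so the sketch should not be expected to reproduce the 10% figure.

```java
class PrimeNeighbors {
    /** Trial-division primality test, sufficient for 16-bit ranges. */
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    /** True if some prime lies within the given distance of n (n itself excluded). */
    static boolean hasPrimeWithin(int n, int distance) {
        for (int d = 1; d <= distance; d++)
            if (isPrime(n - d) || isPrime(n + d)) return true;
        return false;
    }

    /** Fraction of products p*q < limit (p <= q, both prime) that have a prime nearby. */
    static double fractionWithPrimeNeighbor(int limit, int distance) {
        int products = 0, hits = 0;
        for (int p = 2; p * p < limit; p++) {
            if (!isPrime(p)) continue;
            for (int q = p; p * q < limit; q++) {
                if (!isPrime(q)) continue;
                products++;
                if (hasPrimeWithin(p * q, distance)) hits++;
            }
        }
        return products == 0 ? 0.0 : (double) hits / products;
    }

    public static void main(String[] args) {
        System.out.println(hasPrimeWithin(7 * 11, 2)); // the 77 / 79 example
        System.out.println(fractionWithPrimeNeighbor(1 << 16, 2));
    }
}
```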

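Because I am in a network environment behind a proxy, git clone with msysGit always failed. The following environment variable needs to be set.
export http_proxy="http://<proxy domain name>:<port>"
After that, git clone over the http protocol works without any problem, but the git protocol still has problems.

Later I found that git push and git pull often did not work. After many attempts, I found that using more complete command-line arguments solves the problem.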

git pull                                          (fails)
git pull origin                                   (fails)
git pull git@github.com:ueddieu/mmix.git          (works)

It seems that the command-line shortcuts lack some user information, such as the user name "git"
(which is kind of strange at first glance).

git push                                          (fails)
git push origin                                   (fails)
git push git@github.com:ueddieu/mmix.git master   (works)

Anyway, now I can check in code smoothly. :)

Trouble caused by invisible blank characters

There are a few cases in which an invisible blank character causes
a problem that is hard to detect, precisely because the character is not visible.

One famous case is the '\t' character in a makefile, where it marks
the start of a command. If it is replaced by space characters, the makefile does
not work, but you cannot see the difference just by looking at the file.
This kind of problem can drive newbies crazy.

Last week I encountered a similar issue, which was also caused by an unnecessary
blank space.
As you may know, '\' is used for line continuation when you have a very long line; e.g.
when you configure the class path for Java in a property file, you may split the value across several lines, each ending with '\'.


But if you add an extra blank space after the '\', you do not get the complete
content of the classpath. The '\' works as a line continuation only when it is immediately followed by '\n' on Unix or '\r''\n'
on Windows; otherwise, e.g. when '\' is followed by
' ''\n', the line is complete at the '\n' and the content after it becomes the start of
a new line.
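The effect is easy to reproduce with java.util.Properties. The sketch below is only an illustration: the key `classpath` and the jar names `lib/a.jar` and `lib/b.jar` are made up, and the class name `ContinuationDemo` is hypothetical. It loads the same two-line value twice, once with a clean '\' at the end of the first line and once with a blank space after it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class ContinuationDemo {
    /** Parses property-file text from a string. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // '\' directly followed by the line terminator: a real continuation.
        String good = "classpath=lib/a.jar:\\\n    lib/b.jar\n";
        // '\' followed by a blank space and then the line terminator: no continuation.
        String bad = "classpath=lib/a.jar:\\ \n    lib/b.jar\n";

        System.out.println("[" + parse(good).getProperty("classpath") + "]");
        System.out.println("[" + parse(bad).getProperty("classpath") + "]");
        // In the broken case the second line is parsed as a separate key.
        System.out.println(parse(bad).stringPropertyNames());
    }
}
```

In the broken variant the value stops at the first line, and `lib/b.jar` shows up as a key of its own, which matches the symptom of an incomplete classpath.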

Fortunately, it is easy to check for this kind of extra blank space with vi on Unix.
Use the command '$' to go to the end of the line: if there is no extra blank space after the '\',
the cursor lands on the '\'; if there is any blank space after the '\', the cursor lands
after the '\'.
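For a file with many continuation lines, checking each line by hand gets tedious. As an alternative to the vi check, a few lines of Java can report the offending lines. This is a sketch with a made-up class name, `TrailingBlankFinder`; it treats only spaces and tabs after the final '\' as "blank".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TrailingBlankFinder {
    // A backslash followed by one or more spaces or tabs at the very end of a line.
    private static final Pattern BROKEN = Pattern.compile("\\\\[ \\t]+$");

    /** Returns the 1-based numbers of the lines whose '\' is followed by blanks. */
    static List<Integer> brokenContinuations(String text) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = text.split("\r?\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (BROKEN.matcher(lines[i]).find()) hits.add(i + 1);
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "a=1\\\nb\nc=2\\ \nd\n";
        System.out.println(brokenContinuations(sample));
    }
}
```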

"..., I like mangoes," said Mom.
"Mom, I only taught you yesterday and you have forgotten already? It is 'I like watermelon'."


Cannot find libdb-4.7.so.
My fix was to create the symbolic link /usr/lib/libdb-4.7.so, which points to /usr/local/BerkeleyDB/lib/libdb-4.7.so.

The analysis of MOR(MXOR) instruction implementation in MMIXWare

A stupid way to understand the source code.
The implementation of MOR (MXOR) is in the file mmix-arith.w:
 octa bool_mult(y,z,xor)
   octa y,z; /* the operands */
   bool xor; /* do we do xor instead of or? */
 {
   octa o,x;
   register tetra a,b,c;
   register int k;
   /* round k handles byte k of y, counted from the least significant byte */
   for (k=0,o=y,x=zero_octa;o.h||o.l;k++,o=shift_right(o,8,1))
     if (o.l&0xff) {
       a=((z.h>>k)&0x01010101)*0xff; /* bit k of each byte of z.h, stretched to a full byte */
       b=((z.l>>k)&0x01010101)*0xff; /* the same for z.l */
       c=(o.l&0xff)*0x01010101; /* the current byte of y, copied into all four bytes */
       if (xor) x.h^=a&c, x.l^=b&c;
       else x.h|=a&c, x.l|=b&c;
     }
   return x;
 }
 It took me several hours to understand the details.
 If we treat each octabyte as an 8x8 bit matrix in which each row corresponds to a byte, then
 y MOR z = z (matrix_multiply) y
 For a=((z.h>>k)&0x01010101)*0xff;
 (z.h>>k)&0x01010101 picks the last bit of each of the four bytes in (z.h>>k), i.e. bit k of each row of z.h.
 ((z.h>>k)&0x01010101)*0xff then expands each of those bits (either 0 or 1) into its whole row:
 0x01010101 * 0xff = 0xffffffff
 (depending on the last bit in each row, the result could be #ff00ff00, #ff0000ff, etc.)

 Similarly, b=((z.l>>k)&0x01010101)*0xff; expands the last bit in each byte of (z.l>>k) into the
 whole byte.

 Overall, after these two steps, a and b together hold column k of z, with every bit of that column stretched across its whole row.
 Since k varies from 0 to 7, the loop actually visits all the columns of z.

 For c=(o.l&0xff)*0x01010101, it takes the last byte of o.l and copies it into the other three bytes.
 The same c is combined with both a (for x.h) and b (for x.l), so one tetra-wide copy is enough and there is no need for a second copy for the high half.
 One example:
 let (z.h>>k)&0x01010101 = 0x01000101, then a = 0xff00ffff;
 let (z.l>>k)&0x01010101 = 0x01010001, then b = 0xffff00ff;
 let (o.l&0xff) = 0xuv, then c = 0xuvuvuvuv;
 then a&c = 0xuv00uvuv and b&c = 0xuvuv00uv.
 Consider the element [i,j] of the result x, where row i is byte i and column j is bit j within that byte, both counted from the least significant end.
 What value does round k accumulate into it? It is (bit j of the last byte of o.l, which is byte k of y) & (bit k of byte i of z).
 In each round, the 64 combinations of i and j contribute to the 64 bits of x.
 Note that o walks through y from the last byte to the first byte, so there are up to 8 rounds.
 Over all the rounds, element [i,j] accumulates z[i,k] & y[k,j] for k = 0..7,
 which means row i of z is multiplied by column j of y. That conforms to the definition of
 z matrix_multiply y.
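One way to gain confidence in this reading is to port the routine and compare it with a naive bit-by-bit matrix product. The sketch below is my own port, not MMIXware code: it uses a single Java long instead of the octa pair of tetras (so the masks become 64 bits wide), the names `boolMult` and `naive` are made up, and rows and columns are indexed from the least significant end as in the explanation above.

```java
import java.util.Random;

class BoolMult {
    /** 64-bit port of the bool_mult idea: x = z (matrix_multiply) y, with or / xor as the sum. */
    static long boolMult(long y, long z, boolean xor) {
        long x = 0;
        long o = y;
        for (int k = 0; o != 0; k++, o >>>= 8) {
            long row = o & 0xff;                       // byte k of y
            if (row != 0) {
                // bit k of every byte of z, stretched to a full byte (a and b in the original)
                long column = ((z >>> k) & 0x0101010101010101L) * 0xff;
                // byte k of y copied into all eight bytes (c in the original)
                long copies = row * 0x0101010101010101L;
                if (xor) x ^= column & copies;
                else x |= column & copies;
            }
        }
        return x;
    }

    /** Naive reference: x[i][j] = sum over k of z[i][k] & y[k][j], bit by bit. */
    static long naive(long y, long z, boolean xor) {
        long x = 0;
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 8; j++) {
                int bit = 0;
                for (int k = 0; k < 8; k++) {
                    int t = (int) ((z >>> (8 * i + k)) & 1) & (int) ((y >>> (8 * k + j)) & 1);
                    bit = xor ? (bit ^ t) : (bit | t);
                }
                x |= (long) bit << (8 * i + j);
            }
        }
        return x;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        for (int n = 0; n < 1000; n++) {
            long y = rnd.nextLong(), z = rnd.nextLong();
            if (boolMult(y, z, false) != naive(y, z, false)
                    || boolMult(y, z, true) != naive(y, z, true)) {
                throw new AssertionError("mismatch for y=" + y + ", z=" + z);
            }
        }
        System.out.println("port and naive product agree");
    }
}
```

A quick sanity check is the identity matrix 0x8040201008040201 (byte m has only bit m set): multiplying by it on either side returns the other operand unchanged.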

A piece of beautiful and tricky bitwise operation code.

A detailed reading of a piece of beautiful and tricky bitwise operation code.
The following code is from MMIXware. It is used to implement the wyde difference between two octabytes; the function itself handles one tetrabyte (two wydes) at a time.
     in file: "mmix-arith.w"
     tetra wyde_diff(y,z)
       tetra y,z;
     {
       register tetra a=((y>>16)-(z>>16))&0x10000;
       register tetra b=((y&0xffff)-(z&0xffff))&0x10000;
       return y-(z^((y^z)&(b-a-(b>>16))));
     }
It is hard to understand without some thinking and verification. Here is the process I used
to check the correctness of this algorithm.

let y = 0xuuuuvvvv;
     z = 0xccccdddd; (please note that the [c]s may be different hex digits.)
then y>>16 = 0x0000uuuu;
     z>>16 = 0x0000cccc;
then ((y>>16)-(z>>16)) = 0xffffgggg if #uuuu < #cccc or
     ((y>>16)-(z>>16)) = 0x0000gggg if #uuuu >= #cccc
(the subtraction is unsigned, so a borrow sets all the bits above the low wyde, including bit 16.)

so variable a = 0x00010000 if #uuuu < #cccc or
   variable a = 0x00000000 if #uuuu >= #cccc
similarly, we can get
   variable b = 0x00010000 if #vvvv < #dddd or
   variable b = 0x00000000 if #vvvv >= #dddd

For (b-a-(b>>16)), there are four different results depending on a and b.
when #uuuu >= #cccc and #vvvv >= #dddd, (b-a-(b>>16)) = 0x00000000;
when #uuuu >= #cccc and #vvvv < #dddd, (b-a-(b>>16)) = 0x0000ffff;
when #uuuu < #cccc and #vvvv >= #dddd, (b-a-(b>>16)) = 0xffff0000;
when #uuuu < #cccc and #vvvv < #dddd, (b-a-(b>>16)) = 0xffffffff;
You can see that >= maps to #0000 and < maps to #ffff.

For y-(z^((y^z)&(b-a-(b>>16)))): when (b-a-(b>>16)) is 0x00000000, z^((y^z)&(b-a-(b>>16))) is
z^((y^z)&0) = z^0 = z, so y-(z^((y^z)&(b-a-(b>>16)))) = y-z.
Similarly, when (b-a-(b>>16)) is 0xffffffff, z^((y^z)&(b-a-(b>>16))) is
z^((y^z)&0xffffffff) = z^(y^z) = y, so y-(z^((y^z)&(b-a-(b>>16)))) = 0.

When (b-a-(b>>16)) is 0xffff0000 or 0x0000ffff, we can treat y and z as two separate wydes;
each wyde in the result is correct.
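The case analysis can also be checked mechanically. The sketch below is my own Java port of the function (with `>>>` standing in for the unsigned shift on C's tetra) plus a straightforward reference that computes max(y - z, 0) for each wyde separately; the class and method names are made up for illustration.

```java
import java.util.Random;

class WydeDiff {
    /** Port of wyde_diff; Java int arithmetic wraps modulo 2^32 like the unsigned tetra. */
    static int wydeDiff(int y, int z) {
        int a = ((y >>> 16) - (z >>> 16)) & 0x10000;      // borrow flag of the high wydes
        int b = ((y & 0xffff) - (z & 0xffff)) & 0x10000;  // borrow flag of the low wydes
        int mask = b - a - (b >>> 16);                    // #ffff in every wyde where y < z
        return y - (z ^ ((y ^ z) & mask));                // subtract y where y < z, else z
    }

    /** Reference: saturating difference max(y - z, 0), computed one wyde at a time. */
    static int reference(int y, int z) {
        int hi = Math.max(0, (y >>> 16) - (z >>> 16));
        int lo = Math.max(0, (y & 0xffff) - (z & 0xffff));
        return (hi << 16) | lo;
    }

    public static void main(String[] args) {
        Random rnd = new Random(7);
        for (int n = 0; n < 1_000_000; n++) {
            int y = rnd.nextInt(), z = rnd.nextInt();
            if (wydeDiff(y, z) != reference(y, z)) {
                throw new AssertionError(Integer.toHexString(y) + " " + Integer.toHexString(z));
            }
        }
        // high wydes: 5 - 2 = 3; low wydes: 3 < 7, so the result is 0
        System.out.println(Integer.toHexString(wydeDiff(0x00050003, 0x00020007)));
    }
}
```

Random testing is not a proof, but together with the four-case analysis above it leaves little room for doubt.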

You may think it is a little stupid to verify this kind of detail, but from my point of view,
without such a detailed analysis I could not understand the algorithm in the code. With hard
work like this, I successfully understood it. The pleasure is worth the effort.
I wonder how the author discovered such an ingenious algorithm.

Create a MMIX simulator in Java

After reading <<MMIX: A RISC Computer for the New Millennium>>, I was inspired to create an MMIX simulator in Java.
Donald Knuth has already created a high-quality MMIX simulator in C, so why do I bother creating a new one in Java?
First, I want to learn more about how the computer works. I think re-implementing a simulator for MMIX can
help me gain a better understanding.
Second, I want to exercise my Java skills.

After about one month's work, I realize that I cannot finish it by myself, so I am looking for help.
If you are interested in MMIX and know Java, please give me a hand.

Currently I have finished most of the instructions, but some important and complex ones are not completed.
I have developed a few JUnit test cases for some instructions, but they are far from covering all the instructions (there are 256 instructions in total).
A few of the sample MMIX programs in Donald Knuth's MMIXware package, such as cp.mmo and hello.mmo, can be
simulated successfully, but there are many more to support.

To help with this project, you first need access to the current source code. It is hosted on Google
Code. Please follow the steps below to access the source code.

Use this command to anonymously check out the latest project source code:
# Non-members may check out a read-only working copy anonymously over HTTP.
svn checkout http://mmix.googlecode.com/svn/trunk/ mmix-read-only

If you are willing to help, please comment on this blog with your email address.

My confusion about kernel and corresponding clarification.

Many questions came to my mind when I read the Linux kernel book and source code. As time goes by, I have become more knowledgeable and can answer those questions by myself. Here is the first question I answered myself.


Q: Why does the kernel have to map high memory into kernel space? Why not just allocate the high memory and map it only in the user process?

A: Because the kernel also needs to access the high memory before it returns the allocated memory to the user process. For example, the kernel must zero or otherwise initialize the page for security reasons. Please refer to Linux Device Drivers, page 9.

Q: Why not let the C library zero or initialize the page? That would save the kernel's effort and simplify the kernel.

A: Besides requesting memory through the C library, a user program can also request memory through a direct system call. In that situation security would not be guaranteed, and the information left in the memory would be leaked.

How to substitute text in the file.

9/26/2008 8:57AM
Today I want to research the different ways to substitute text in a file. For the record, I have written them down.

1. Use UltraEdit. It is super easy for a Windows user who has UltraEdit installed.
Use Ctrl + R to get the replacement wizard and follow your intuition.

2. Use vi in Unix.
Its substitute command (:s) will replace xx with yy; for example, :%s/xx/yy/g does so on every line.

3. Use a filter, such as sed or awk, in Unix.
sed -e 's/xx/yy/g' file.in > file.out
This replaces xx with yy in all the lines. Used this way, sed does not change the original input file, so I redirect the output to file.out.
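For completeness, the same substitution can also be done from a small Java program, which may be handy when no Unix tools are at hand. This is a sketch with made-up names (`Substitute`, `substituteFile`); unlike sed, `String.replace` treats xx as literal text rather than as a regular expression, and the file is assumed to be UTF-8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Substitute {
    /** Replaces every literal occurrence of 'from' with 'to'. */
    static String substitute(String text, String from, String to) {
        return text.replace(from, to);
    }

    /** Reads 'in', substitutes, and writes the result to 'out'; 'in' itself is not modified. */
    static void substituteFile(Path in, Path out, String from, String to) {
        try {
            String text = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
            Files.write(out, substitute(text, from, to).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java Substitute.java file.in file.out");
            return;
        }
        substituteFile(Paths.get(args[0]), Paths.get(args[1]), "xx", "yy");
    }
}
```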

WYSIWYM stands for What You See Is What You Mean; WYSIWYG stands for What You See Is What You Get.

Microsoft Word is always considered an example of WYSIWYG. Today I had a look at a tool named LyX, which is an example of WYSIWYM. From an end user's point of view, there are more similarities than differences between them.

They both display the resulting layout on the fly; they both provide a button to typeset the document.

The difference I can see between them is that LyX uses a text file, while Word uses a binary file. But I don't think it matters.

In my humble opinion, the real difference between Word and LyX/LaTeX is the following. In Word, you typeset at a lower level: you can control all the details, but it also takes more effort. In LyX/LaTeX, you typeset at a higher level: you only need to figure out the logical structure of the document. The resulting layout is not decided by you; you simply share the layout developed by experts. I think that is the key advantage of WYSIWYM.

Trouble shooting - Fail to send out mail from application server

Yesterday, we found that the application could not send mail successfully; the performance of the module using the email feature was also very bad. I suspected the cause was that the mail server host name could not be resolved on the application server.

I executed the following command
host <mail server host name>
It showed a strange IP, which means the mail server host name could not be resolved properly.

Then I executed the command below.
man host
The output pointed me to /etc/resolv.conf, so I opened it with
vi /etc/resolv.conf

The content is as follows:
nameserver <name server 1>
nameserver <name server 2>
I updated the config with the correct DNS server IPs.

Everything is OK.

P.S. It seems that the ping and host commands behave differently. For some host names, I can ping them but I cannot look them up with host.
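Since the application itself is the one that fails, it can help to ask the JVM what it resolves the name to. The sketch below uses the standard `InetAddress.getAllByName` call; the class name `ResolveCheck` and the default host `localhost` are placeholders of mine. The JVM normally goes through the system resolver, which on a typical Linux setup consults /etc/hosts as well as DNS, much like ping does, while host queries the name servers from /etc/resolv.conf directly. That is a likely explanation for the difference noted in the P.S.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

class ResolveCheck {
    /** Returns all addresses the JVM resolves for the host; empty if it cannot be resolved. */
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Unresolvable name: report it as an empty list.
        }
        return addresses;
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host));
    }
}
```

Run it on the application server with the mail server host name as the argument and compare the result with what host prints.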

The inelegance in Operating System

The reality is far from the idealism - the inelegance in Operating System

I am interested in operating systems. As I learn more and more concepts and more and more details, I realize that the reality is far from the ideal. The root cause is history and, to some extent, backward compatibility. We cannot afford to build a brand new thing from scratch; we need to carry many old things along in anything new.

Let me give some examples of how history makes the current operating systems complicated and inelegant.
1. DMA
DMA stands for Direct Memory Access, which is a way to improve the parallelism in a computer system. Basically, with DMA, a peripheral device can access main memory while the CPU is running. But for historical reasons, on the x86 platform some DMA devices have only 24 address lines, which limits the reachable memory to 16M. Since the x86 platform also lacks an IO-MMU to remap addresses, the memory that can be used for DMA is [0,16M). This definitely complicates memory management.

2. High Memory
Since the Linux kernel has only 1G of linear address space, it cannot address all of the 4G of physical memory on a 32-bit machine. This is actually a design issue in Linux with historical roots: nobody predicted that physical memory would some day become so large. Later, in order to support more than 1G of physical memory, the CONFIG_HIGHMEM compile option was added. There are also other ways to fix this problem, such as 4G kernel space vs. 4G user space.

3. PAE
PAE stands for Physical Address Extension. PAE makes it possible to support up to 64G of physical memory, but to me it is just a temporary solution that does not deserve the effort. I do not even want to look at the corresponding documents; it does not make much sense to me. I prefer to move directly to a 64-bit platform, although 64-bit platforms have their own problems.
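The sizes quoted in the three items all come from the same arithmetic: n address bits reach 2^n bytes. The small sketch below just spells that out. The class name is made up, and the 36-bit figure for PAE is the width of the physical address under PAE, which is where the 64G limit comes from.

```java
class AddressLimits {
    /** Number of bytes reachable with the given number of address bits. */
    static long addressableBytes(int addressBits) {
        return 1L << addressBits;
    }

    /** Formats a byte count with binary units, e.g. 16777216 as "16M". */
    static String human(long bytes) {
        String[] units = {"", "K", "M", "G", "T"};
        int u = 0;
        while (bytes >= 1024 && bytes % 1024 == 0 && u < units.length - 1) {
            bytes /= 1024;
            u++;
        }
        return bytes + units[u];
    }

    public static void main(String[] args) {
        System.out.println("24-bit DMA addresses:  " + human(addressableBytes(24)));
        System.out.println("30-bit kernel window:  " + human(addressableBytes(30)));
        System.out.println("32-bit addresses:      " + human(addressableBytes(32)));
        System.out.println("36-bit PAE addresses:  " + human(addressableBytes(36)));
    }
}
```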

The above are just some inelegant points in hardware, mainly caused by historical reasons. I wonder how we can keep up rapid development under the burden of history. Maybe at some point we will finally need to throw away the history and move on with a brand new start.

