今天在做某项目的POC测试,甲方提供了一个三十万记录的TXT数据文件,需要把该文件的记录插入到数据库中,由于项目部的同事在搭建测试环境中用的是Mysql数据库,在把数据导入到数据库中用的是JDBC的批处理。代码如下
private void batchParseGroup(){
Connection con= null;
PreparedStatement ps =null;
try {
con= DbConnectionManager.getConnection();
con.setAutoCommit(false);
String sql = " insert into jivegroup(uri,groupname,pgroupid,description,creationdate,modificationdate,priority,selfpriority) values(?,?,?,?,?,?,?,?)";
ps= con.prepareStatement(sql);
for(int i=0;i<groupList.size();i++){
Group group = groupList.get(i);
ps.setString(1, group.getUri());
ps.setString(2, group.getName());
ps.setString(3, group.getPgroupId());
ps.setString(4, group.getName());
ps.setString(5, ""+System.currentTimeMillis());
ps.setString(6, ""+System.currentTimeMillis());
ps.setInt(7, group.getPriority());
ps.setInt(8, group.getPriority());
ps.addBatch();
if(i%100==0){
ps.executeBatch();
}
}
con.commit();
ps.executeBatch();
} catch (SQLException e) {
e.printStackTrace();
}finally{
DbConnectionManager.closeConnection(ps, con);
}
} 在测试时,发现三十万的数据居然需要十分钟左右的时间。首先想到的就是Mysql的相关配置是不是有问题,反复修改了Mysql的相应配置参数,收效甚微。
在Mysql的官网上查到如下:
http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-configuration-properties.html
关于rewriteBatchedStatements参数,Mysql官方的说明:
Should the driver use multiqueries (irregardless of the setting of "allowMultiQueries") as well as rewriting of prepared statements for INSERT into multi-value inserts when executeBatch() is called? Notice that this has the potential for SQL injection if using plain java.sql.Statements and your code doesn't sanitize input correctly. Notice that for prepared statements, server-side prepared statements can not currently take advantage of this rewrite option, and that if you don't specify stream lengths when using PreparedStatement.set*Stream(), the driver won't be able to determine the optimum number of parameters per batch and you might receive an error from the driver that the resultant packet is too large. Statement.getGeneratedKeys() for these rewritten statements only works when the entire batch includes INSERT statements.
解决办法:
下载最新的JDBC的驱动程序。
MYSQL URL的配置参数如下:
jdbc:mysql://54.200.190.80:3306/ccb_ucstar?rewriteBatchedStatements=true
经过测试。三十多万的数据。70秒内搞定!