menu

Friday, October 10, 2014

Using Prepared Statement In Java


Introduction:

     Using prepared statements, we can get faster and more secure results when we send queries to the database systems. In this write we’ll discuss both the benefits and performance impact of using prepared statements.

Why more secure ?

     Below is a java code written to execute an update statement in mysql database.Normally we should use connection pooling instead of directly connecting the database, but here we use direct connection just for explanation.

  String CONN_STRING = "jdbc:mysql://localhost:3306/DB_NAME?user=USER&password=PWD";
  String UPDATE_SQL = "UPDATE User SET age=? WHERE name=?";
  Connection conn = null;
  PreparedStatement pstmt = null;
    try {
            Class.forName("com.mysql.jdbc.Driver");
            conn = DriverManager.getConnection(CONN_STRING);
            conn.setAutoCommit(false);
            pstmt = conn.prepareStatement(UPDATE_SQL);
            pstmt.setInt(1, 30);
            pstmt.setString(2, "xxx");
            int rows = pstmt.executeUpdate();
            conn.commit();
            System.out.println(rows + " rows updated");
          } catch (SQLException e) {
               try {
                       if (conn != null && !conn.isClosed()) 
                           conn.rollback();
                           // TODO logging...
                     } catch (SQLException e1) {
                           // TODO logging...
                     }
          } catch (Exception e) {
               try {
                       if (conn != null && !conn.isClosed())
                           conn.rollback();
                           // TODO logging...
                      } catch (SQLException e1) {
                          // TODO logging...
                      }
          } finally {
                try {
                        if (pstmt != null)
                            pstmt.close();
                        if (conn != null && !conn.isClosed())
                            conn.close();
                       } catch (SQLException e) {
                               // TODO logging...
                       }
          }                      

As we see from the above code, the parameters that sql sentence expect were sent by question mark. Lets think the age and name value in the sql sentence comes from external world, ie from screen.If this is the case, when we directly add age and name value to our sql sentence instead of using prepared statement, we allow the external system to modify our sql sentence. On the other hand, with the help of prepared statement, the value that comes from external word will not be part of the sql sentence, instead it will be only a parameter. This prevents sql injection.An injected sql sentence might be
UPDATE User SET age=5 WHERE name=’xxx’ or 1=1. Here the “name” value comes from external world and include value ‘xxx’ or 1=1 and it updates all table rows with age 5. To prevent this you need to add extra controls in your java code, but thanks to the prepared statement, the name parameter cannot be send to be included in sql sentence like ‘xxx’ or 1=1.

Why better performance ?

     First, lets examine the steps that a query passes before executing. I try to be generic, but these steps may vary slightly depending on the database engine.

1.)   Parsing:

     In this step, the sql sentence is checked for correct syntax, spelling and keywords. Also its checked for the existing of tables and columns.Depending on the database engine there may also be some syntax optimizations like converting right outer joins to left outer joins and converting subqueries to equivalent joins.

2.)   Compiling:

     This step may be optional. If the sql sentence include database specific functions or expressions, they need to be compiled to standart DML language.

3.)   Optimizing:

           The most important step is optimizing since it directly effects the performance.In the optimizing step the query processor of the database engine try to find out the most effective way to run the query. If the sql sentence is executed for the first time it looks for the available statistics and extract an execution plan(access path) for the query.This execution plan is extracted by looking to statistics of indexes, relations on tables and data load on tables. The plan may include any kind of join techniques including nested loop join, merge join  ad hash join .For details of join techniques you can look at here.

4.)   Execution:
      
           After determining an execution plan, the query is executed as communicating between the query processor and the database engine.As we said, if there is a previously created plan exists, it will be used by the query processor.That means the first 3 steps will be executed only one time.How is this plan found by the processor may vary database to database.A possible solution may be storing hash of query at the first execution, and later use this hash value to determine if there is a plan exist. The most used plans may also be cached in memory for fast lookup.

Now lets look for the benefit of prepared statement at this point.Creating a execution plan for a query may take longer than executing the query.Since its a heavy load process,after the first execution, the plan is saved for later invocations, and re-used. If you don’t use prepared statement and send the sql parameters as part of the query, each execution will be treated as a different query by the query processor because changed parameters will change sql sentence, and each time it will try to create and use an execution plan.


Application side of using prepared statement:

           Up to now we look for the database side of using prepared statements. However there are also application and application server side effects.In java side, since prepared statement is a heavy load object, it can be cached. Most of the application servers have ability of caching prepared statements.If it is cached, not every call of preparedStatement method of Connection object will create a new statement, instead it will get from the cache if there exist one.

      Another important point is closing the prepared statement after using. Most of the database implementations have both connection limit and max. number of statements limit per connection. That means if you forget to close prepared statement, and if you use more than one statement for one connection, you can get database error indicating there is no available prepared statement to open.

       Conclusion:

           Today we talk about prepared statement usage in java. It is so important using prepared statements if you want more secure and faster applications.Prepared statements are used with fine grained jdbc calls and although the first execution is relatively slow because of the impact of creating execution plan, the later invocation is as fast as using stored procedures or sqlj calls. As a related topic you may like this article about batching.

1 comment: