Using Database MetaData methods appropriately
Developing performance-oriented JDBC applications is not easy. JDBC drivers do not throw exceptions to tell you when your code is running too slow.
Part I - Using Database MetaData methods appropriately
Part II - Retrieving Only Required Data
Part III - Selecting Functions that Optimize Performance
Part IV - Managing Connections and Updates
Introduction
John Goodson is a JDBC Expert Panel Member and VP of Research and Development at DataDirect Technologies. Check out his Tech Talk on TheServerSide.com, where he discusses Database Access Technologies. |
Developing performance-oriented JDBC applications is not easy. JDBC drivers do not throw exceptions to tell you when your code is running too slow.
This series of performance tips presents some general guidelines for improving JDBC application performance that have been compiled by examining the JDBC implementations of numerous shipping JDBC applications. These guidelines include:
- Using Database MetaData methods appropriately
- Retrieve only required data
- Select functions that optimize performance
- Manage connections and updates
Following these general rules will help you solve some common JDBC system performance problems.
Using Database Metadata Methods
Because database metadata methods that generate ResultSet objects are slow compared to other JDBC methods, their frequent use can impair system performance. The guidelines in this section will help you optimize system performance when selecting and using database metadata.
Minimizing the Use of Database Metadata Methods
Compared to other JDBC methods, database metadata methods that generate ResultSet objects are relatively slow. Applications should cache information returned from result sets that generate database metadata methods so that multiple executions are not needed.
Although almost no JDBC application can be written without database metadata methods, you can improve system performance by minimizing their use. To return all result column information mandated by the JDBC specification, a JDBC driver may have to perform complex queries or multiple queries to return the necessary result set for a single call to a database metadata method. These particular elements of the SQL language are performance expensive.
Applications should cache information from database metadata methods. For example, call getTypeInfo once in the application and cache away the elements of the result set that your application depends on. It is unlikely that any application uses all elements of the result set generated by a database metadata method, so the cache of information should not be difficult to maintain.
Avoiding Search Patterns
Using null arguments or search patterns in database metadata methods results in generating time-consuming queries. In addition, network traffic potentially increases due to unwanted results. Always supply as many non-null arguments to result sets that generate database metadata methods as possible.
Because database metadata methods are slow, applications should invoke them as efficiently as possible. Many applications pass the fewest non-null arguments necessary for the function to return success.
for example:
ResultSet WSrs = WSc.getTables (null, null, "WSTable", null);
should be:
ResultSet WSrs = WSc.getTables ("cat1", "johng", "WSTable", "TABLE");
In the first getTables() call, the application probably wants to know if the table WSTable exists. Of course, a JDBC driver takes the call literally and interprets the request differently. A JDBC driver interprets the request as: return all tables, views, system tables, synonyms, temporary tables, or aliases that exist in any database schema inside any database catalog that are named ‘WSTable’.
The second call to getTables() more accurately reflects what the application wants to know. A JDBC driver interprets this request as: return all tables that exist in the ‘johng’ schema in the current catalog where the name is ‘WSTable’.
Clearly, a JDBC driver can process the second request much more efficiently than it can process the first request.
Sometimes, little information is known about the object for which you are requesting information. Any information that the application can send the driver when calling database metadata methods can result in improved performance and reliability.
Using a Dummy Query to Determine Table Characteristics
Avoid using getColumns() to determine characteristics about a table. Instead, use a dummy query with getMetadata().
Consider an application that allows the user to choose the columns that will be selected. Should the application use getColumns() to return information about the columns to the user or instead prepare a dummy query and call getMetadata()?
Case 1: GetColumns Method
ResultSet WSrc = WSc.getColumns (... "UnknownTable" ...); // This call to getColumns() will generate a query to // the system catalogs... possibly a join // which must be prepared, executed, and produce // a result set . . . WSrc.next(); string Cname = getString(4); . . . // user must retrieve N rows from the server // N = # result columns of UnknownTable // result column information has now been obtained
Case 2: GetMetadata Method
// prepare dummy query PreparedStatement WSps = WSc.prepareStatement ("SELECT * from UnknownTable WHERE 1 = 0"); // query is never executed on the server - only prepared ResultSetMetaData WSsmd=WSps.getMetaData(); int numcols = WSrsmd.getColumnCount(); ... int ctype = WSrsmd.getColumnType(n) ... // result column information has now been obtained
In both cases, a query is sent to the server. But in Case 1, the query must be prepared and executed, the result description information must be formulated, and a result set of rows must be sent to the client. In Case 2, a simple query must be prepared and only result description information must be formulated. Clearly, Case 2 is the better performing model.
To somewhat complicate this discussion, let us consider a DBMS server that does not natively support preparing a SQL statement. The performance of Case 1 does not change, but the performance of Case 2 increases slightly because the dummy query must be evaluated instead of only prepared. Because the Where clause of the query always evaluates to FALSE, the query generates no result rows and should execute without accessing table data. For this situation, method 2 still outperforms method 1.
In summary, always use result set metadata to retrieve table column information such as column names, column data types, and column precision and scale. Only use getColumns() when the requested information cannot be obtained from result set metadata (i.e. table column default values).
Look out for next month's performance tip, on 'Retrieving Only Required Data'.