31.7.3. Read-ahead
Optimized loading in JBoss is called read-ahead. This refers to the technique of reading the row for an entity being loaded, as well as the next several rows; hence the term read-ahead. JBoss implements two main strategies (
on-find
and on-load
) to optimize the loading problem identified in the previous section. The extra data loaded during read-ahead is not immediately associated with an entity object in memory, as entities are not materialized in JBoss until actually accessed. Instead, it is stored in the preload cache where it remains until it is loaded into an entity or the end of the transaction occurs. The following sections describe the read-ahead strategies.
31.7.3.1. on-find
The
on-find
strategy reads additional columns when the query is invoked. If the query is on-find
optimized, JBoss will execute the following query when the query is executed.
SELECT t0_g.id, t0_g.name, t0_g.nick_name, t0_g.badness FROM gangster t0_g ORDER BY t0_g.id ASC
All of the required data would be in the preload cache, so no additional queries would need to be executed while iterating through the query results. This strategy is effective for queries that return a small amount of data, but it becomes very inefficient when trying to load a large result set into memory. The following table shows the execution of this query:
Table 31.2. on-find Optimized Query Execution
id | name | nick_name | badness | hangout | organization |
---|---|---|---|---|---|
0 | Yojimbo | Bodyguard | 7 | 0 | Yakuza |
1 | Takeshi | Master | 10 | 1 | Yakuza |
2 | Yuriko | Four finger | 4 | 2 | Yakuza |
3 | Chow | Killer | 9 | 3 | Triads |
4 | Shogi | Lightning | 8 | 4 | Triads |
5 | Valentino | Pizza-Face | 4 | 5 | Mafia |
6 | Toni | Toothless | 2 | 6 | Mafia |
7 | Corleone | Godfather | 6 | 7 | Mafia |
The
read-ahead
strategy and load-group
for a query is defined in the query
element. If a read-ahead
strategy is not declared in the query
element, the strategy declared in the entity
element or defaults
element is used. The on-find
configuration follows:
<jbosscmp-jdbc> <enterprise-beans> <entity> <ejb-name>GangsterEJB</ejb-name> <!--...--> <query> <query-method> <method-name>findAll_onfind</method-name> <method-params/> </query-method> <jboss-ql><![CDATA[ SELECT OBJECT(g) FROM gangster g ORDER BY g.gangsterId ]]></jboss-ql> <read-ahead> <strategy>on-find</strategy> <page-size>4</page-size> <eager-load-group>basic</eager-load-group> </read-ahead> </query> </entity> </enterprise-beans> </jbosscmp-jdbc>
One problem with the
on-find
strategy is that it must load additional data for every entity selected. Commonly in web applications only a fixed number of results are rendered on a page. Since the preloaded data is only valid for the length of the transaction, and a transaction is limited to a single web HTTP hit, most of the preloaded data is not used. The on-load
strategy discussed in the next section does not suffer from this problem.
31.7.3.1.1. Left join read ahead
Left join read ahead is an enhanced
on-find
read-ahead
strategy. It allows you to preload in one SQL query not only fields from the base instance but also related instances which can be reached from the base instance by CMR navigation. There are no limitation for the depth of CMR navigations. There are also no limitations for cardinality of CMR fields used in navigation and relationship type mapping, i.e. both foreign key and relation-table mapping styles are supported. Let us look at some examples. Entity and relationship declarations can be found below.
31.7.3.1.2. D#findByPrimaryKey
Suppose we have an entity
D
. A typical SQL query generated for the findByPrimaryKey
would look like this:
SELECT t0_D.id, t0_D.name FROM D t0_D WHERE t0_D.id=?
Suppose that while executing
findByPrimaryKey
we also want to preload two collection-valued CMR fields bs
and cs
.
<query> <query-method> <method-name>findByPrimaryKey</method-name> <method-params> <method-param>java.lang.Long</method-param> </method-params> </query-method> <jboss-ql><![CDATA[SELECT OBJECT(o) FROM D AS o WHERE o.id = ?1]]></jboss-ql> <read-ahead> <strategy>on-find</strategy> <page-size>4</page-size> <eager-load-group>basic</eager-load-group> <left-join cmr-field="bs" eager-load-group="basic"/> <left-join cmr-field="cs" eager-load-group="basic"/> </read-ahead> </query>
The
left-join
declares the relations to be eager loaded. The generated SQL would look like this:
SELECT t0_D.id, t0_D.name, t1_D_bs.id, t1_D_bs.name, t2_D_cs.id, t2_D_cs.name FROM D t0_D LEFT OUTER JOIN B t1_D_bs ON t0_D.id=t1_D_bs.D_FK LEFT OUTER JOIN C t2_D_cs ON t0_D.id=t2_D_cs.D_FK WHERE t0_D.id=?
For the
D
with the specific id we preload all its related B
's and C
's and can access those instance loading them from the read ahead cache, not from the database.
31.7.3.1.3. D#findAll
In the same way, we could optimize the
findAll
method on D
selects all the D
's. A normal findAll query would look like this:
SELECT DISTINCT t0_o.id, t0_o.name FROM D t0_o ORDER BY t0_o.id DESC
To preload the relations, we simply need to add the
left-join
elements to the query.
<query> <query-method> <method-name>findAll</method-name> </query-method> <jboss-ql><![CDATA[SELECT DISTINCT OBJECT(o) FROM D AS o ORDER BY o.id DESC]]></jboss-ql> <read-ahead> <strategy>on-find</strategy> <page-size>4</page-size> <eager-load-group>basic</eager-load-group> <left-join cmr-field="bs" eager-load-group="basic"/> <left-join cmr-field="cs" eager-load-group="basic"/> </read-ahead> </query>
And here is the generated SQL:
SELECT DISTINCT t0_o.id, t0_o.name, t1_o_bs.id, t1_o_bs.name, t2_o_cs.id, t2_o_cs.name FROM D t0_o LEFT OUTER JOIN B t1_o_bs ON t0_o.id=t1_o_bs.D_FK LEFT OUTER JOIN C t2_o_cs ON t0_o.id=t2_o_cs.D_FK ORDER BY t0_o.id DESC
Now the simple
findAll
query now preloads the related B
and C
objects for each D
object.
31.7.3.1.4. A#findAll
Now let us look at a more complex configuration. Here we want to preload instance
A
along with several relations.
- its parent (self-relation) reached from
A
with CMR fieldparent
- the
B
reached fromA
with CMR fieldb
, and the relatedC
reached fromB
with CMR fieldc
B
reached fromA
but this time with CMR fieldb2
and related to itC
reached from B with CMR field c.
For reference, the standard query would be:
SELECT t0_o.id, t0_o.name FROM A t0_o ORDER BY t0_o.id DESC FOR UPDATE
The following metadata describes our preloading plan.
<query> <query-method> <method-name>findAll</method-name> </query-method> <jboss-ql><![CDATA[SELECT OBJECT(o) FROM A AS o ORDER BY o.id DESC]]></jboss-ql> <read-ahead> <strategy>on-find</strategy> <page-size>4</page-size> <eager-load-group>basic</eager-load-group> <left-join cmr-field="parent" eager-load-group="basic"/> <left-join cmr-field="b" eager-load-group="basic"> <left-join cmr-field="c" eager-load-group="basic"/> </left-join> <left-join cmr-field="b2" eager-load-group="basic"> <left-join cmr-field="c" eager-load-group="basic"/> </left-join> </read-ahead> </query>
The SQL query generated would be:
SELECT t0_o.id, t0_o.name, t1_o_parent.id, t1_o_parent.name, t2_o_b.id, t2_o_b.name, t3_o_b_c.id, t3_o_b_c.name, t4_o_b2.id, t4_o_b2.name, t5_o_b2_c.id, t5_o_b2_c.name FROM A t0_o LEFT OUTER JOIN A t1_o_parent ON t0_o.PARENT=t1_o_parent.id LEFT OUTER JOIN B t2_o_b ON t0_o.B_FK=t2_o_b.id LEFT OUTER JOIN C t3_o_b_c ON t2_o_b.C_FK=t3_o_b_c.id LEFT OUTER JOIN B t4_o_b2 ON t0_o.B2_FK=t4_o_b2.id LEFT OUTER JOIN C t5_o_b2_c ON t4_o_b2.C_FK=t5_o_b2_c.id ORDER BY t0_o.id DESC FOR UPDATE
With this configuration, you can navigate CMRs from any found instance of
A
without an additional database load.
31.7.3.1.5. A#findMeParentGrandParent
Here is another example of self-relation. Suppose, we want to write a method that would preload an instance, its parent, grand-parent and its grand-grand-parent in one query. To do this, we would used nested
left-join
declaration.
<query> <query-method> <method-name>findMeParentGrandParent</method-name> <method-params> <method-param>java.lang.Long</method-param> </method-params> </query-method> <jboss-ql><![CDATA[SELECT OBJECT(o) FROM A AS o WHERE o.id = ?1]]></jboss-ql> <read-ahead> <strategy>on-find</strategy> <page-size>4</page-size> <eager-load-group>*</eager-load-group> <left-join cmr-field="parent" eager-load-group="basic"> <left-join cmr-field="parent" eager-load-group="basic"> <left-join cmr-field="parent" eager-load-group="basic"/> </left-join> </left-join> </read-ahead> </query>
The generated SQL would be:
SELECT t0_o.id, t0_o.name, t0_o.secondName, t0_o.B_FK, t0_o.B2_FK, t0_o.PARENT, t1_o_parent.id, t1_o_parent.name, t2_o_parent_parent.id, t2_o_parent_parent.name, t3_o_parent_parent_parent.id, t3_o_parent_parent_parent.name FROM A t0_o LEFT OUTER JOIN A t1_o_parent ON t0_o.PARENT=t1_o_parent.id LEFT OUTER JOIN A t2_o_parent_parent ON t1_o_parent.PARENT=t2_o_parent_parent.id LEFT OUTER JOIN A t3_o_parent_parent_parent ON t2_o_parent_parent.PARENT=t3_o_parent_parent_parent.id WHERE (t0_o.id = ?) FOR UPDATE
Note, if we remove
left-join
metadata we will have only
SELECT t0_o.id, t0_o.name, t0_o.secondName, t0_o.B2_FK, t0_o.PARENT FOR UPDATE