31.7.3. Read-ahead

Optimized loading in JBoss is called read-ahead. This refers to the technique of reading the row for an entity being loaded, as well as the next several rows; hence the term read-ahead. JBoss implements two main strategies (on-find and on-load) to optimize the loading problem identified in the previous section. The extra data loaded during read-ahead is not immediately associated with an entity object in memory, as entities are not materialized in JBoss until actually accessed. Instead, it is stored in the preload cache where it remains until it is loaded into an entity or the end of the transaction occurs. The following sections describe the read-ahead strategies.

31.7.3.1. on-find

The on-find strategy reads additional columns when the query is invoked. If the query is on-find optimized, JBoss will execute the following query when the query is executed.
SELECT t0_g.id, t0_g.name, t0_g.nick_name, t0_g.badness 
    FROM gangster t0_g
    ORDER BY t0_g.id ASC
All of the required data would be in the preload cache, so no additional queries would need to be executed while iterating through the query results. This strategy is effective for queries that return a small amount of data, but it becomes very inefficient when trying to load a large result set into memory. The following table shows the execution of this query:

Table 31.2. on-find Optimized Query Execution

id name nick_name badness hangout organization
0 Yojimbo Bodyguard 7 0 Yakuza
1 Takeshi Master 10 1 Yakuza
2 Yuriko Four finger 4 2 Yakuza
3 Chow Killer 9 3 Triads
4 Shogi Lightning 8 4 Triads
5 Valentino Pizza-Face 4 5 Mafia
6 Toni Toothless 2 6 Mafia
7 Corleone Godfather 6 7 Mafia
The read-ahead strategy and load-group for a query is defined in the query element. If a read-ahead strategy is not declared in the query element, the strategy declared in the entity element or defaults element is used. The on-find configuration follows:
<jbosscmp-jdbc>
    <enterprise-beans>
        <entity>
            <ejb-name>GangsterEJB</ejb-name>
            <!--...-->
            <query>
                <query-method>
                    <method-name>findAll_onfind</method-name>
                    <method-params/>
                </query-method>
                <jboss-ql><![CDATA[
                 SELECT OBJECT(g)
                 FROM gangster g
                 ORDER BY g.gangsterId
                 ]]></jboss-ql>
                <read-ahead>
                    <strategy>on-find</strategy>
                    <page-size>4</page-size>
                    <eager-load-group>basic</eager-load-group>
                </read-ahead>
            </query>
        </entity>
    </enterprise-beans>
</jbosscmp-jdbc>
One problem with the on-find strategy is that it must load additional data for every entity selected. Commonly in web applications only a fixed number of results are rendered on a page. Since the preloaded data is only valid for the length of the transaction, and a transaction is limited to a single web HTTP hit, most of the preloaded data is not used. The on-load strategy discussed in the next section does not suffer from this problem.
31.7.3.1.1. Left join read ahead
Left join read ahead is an enhanced on-findread-ahead strategy. It allows you to preload in one SQL query not only fields from the base instance but also related instances which can be reached from the base instance by CMR navigation. There are no limitation for the depth of CMR navigations. There are also no limitations for cardinality of CMR fields used in navigation and relationship type mapping, i.e. both foreign key and relation-table mapping styles are supported. Let us look at some examples. Entity and relationship declarations can be found below.
31.7.3.1.2. D#findByPrimaryKey
Suppose we have an entity D. A typical SQL query generated for the findByPrimaryKey would look like this:
SELECT t0_D.id, t0_D.name FROM D t0_D WHERE t0_D.id=?
Suppose that while executing findByPrimaryKey we also want to preload two collection-valued CMR fields bs and cs.
<query>
    <query-method>
        <method-name>findByPrimaryKey</method-name>
        <method-params>
            <method-param>java.lang.Long</method-param>
        </method-params>
    </query-method>
    <jboss-ql><![CDATA[SELECT OBJECT(o) FROM D AS o WHERE o.id = ?1]]></jboss-ql>
    <read-ahead>
        <strategy>on-find</strategy>
        <page-size>4</page-size>
        <eager-load-group>basic</eager-load-group>
        <left-join cmr-field="bs" eager-load-group="basic"/>
        <left-join cmr-field="cs" eager-load-group="basic"/>
    </read-ahead>
</query>
The left-join declares the relations to be eager loaded. The generated SQL would look like this:
SELECT t0_D.id, t0_D.name,
       t1_D_bs.id, t1_D_bs.name,
       t2_D_cs.id, t2_D_cs.name
  FROM D t0_D
       LEFT OUTER JOIN B t1_D_bs ON t0_D.id=t1_D_bs.D_FK
       LEFT OUTER JOIN C t2_D_cs ON t0_D.id=t2_D_cs.D_FK
 WHERE t0_D.id=?
For the D with the specific id we preload all its related B's and C's and can access those instance loading them from the read ahead cache, not from the database.
31.7.3.1.3. D#findAll
In the same way, we could optimize the findAll method on D selects all the D's. A normal findAll query would look like this:
SELECT DISTINCT t0_o.id, t0_o.name FROM D t0_o ORDER BY t0_o.id DESC
To preload the relations, we simply need to add the left-join elements to the query.
<query>
    <query-method>
        <method-name>findAll</method-name>
    </query-method>
    <jboss-ql><![CDATA[SELECT DISTINCT OBJECT(o) FROM D AS o ORDER BY o.id DESC]]></jboss-ql>
    <read-ahead>
        <strategy>on-find</strategy>
        <page-size>4</page-size>
        <eager-load-group>basic</eager-load-group>
        <left-join cmr-field="bs" eager-load-group="basic"/>
        <left-join cmr-field="cs" eager-load-group="basic"/>
    </read-ahead>
</query>
And here is the generated SQL:
SELECT DISTINCT t0_o.id, t0_o.name,
                t1_o_bs.id, t1_o_bs.name,
                t2_o_cs.id, t2_o_cs.name
  FROM D t0_o
       LEFT OUTER JOIN B t1_o_bs ON t0_o.id=t1_o_bs.D_FK
       LEFT OUTER JOIN C t2_o_cs ON t0_o.id=t2_o_cs.D_FK
 ORDER BY t0_o.id DESC
Now the simple findAll query now preloads the related B and C objects for each D object.
31.7.3.1.4. A#findAll
Now let us look at a more complex configuration. Here we want to preload instance A along with several relations.
  • its parent (self-relation) reached from A with CMR field parent
  • the B reached from A with CMR field b, and the related C reached from B with CMR field c
  • B reached from A but this time with CMR field b2 and related to it C reached from B with CMR field c.
For reference, the standard query would be:
SELECT t0_o.id, t0_o.name FROM A t0_o ORDER BY t0_o.id DESC FOR UPDATE
The following metadata describes our preloading plan.
<query>
    <query-method>
        <method-name>findAll</method-name>
    </query-method>
    <jboss-ql><![CDATA[SELECT OBJECT(o) FROM A AS o ORDER BY o.id DESC]]></jboss-ql>
    <read-ahead>
        <strategy>on-find</strategy>
        <page-size>4</page-size>
        <eager-load-group>basic</eager-load-group>
        <left-join cmr-field="parent" eager-load-group="basic"/>
        <left-join cmr-field="b" eager-load-group="basic">
            <left-join cmr-field="c" eager-load-group="basic"/>
        </left-join>
        <left-join cmr-field="b2" eager-load-group="basic">
            <left-join cmr-field="c" eager-load-group="basic"/>
        </left-join>
    </read-ahead>
</query>
The SQL query generated would be:
SELECT t0_o.id, t0_o.name,
       t1_o_parent.id, t1_o_parent.name,
       t2_o_b.id, t2_o_b.name,
       t3_o_b_c.id, t3_o_b_c.name,
       t4_o_b2.id, t4_o_b2.name,
       t5_o_b2_c.id, t5_o_b2_c.name
  FROM A t0_o
       LEFT OUTER JOIN A t1_o_parent ON t0_o.PARENT=t1_o_parent.id
       LEFT OUTER JOIN B t2_o_b ON t0_o.B_FK=t2_o_b.id
       LEFT OUTER JOIN C t3_o_b_c ON t2_o_b.C_FK=t3_o_b_c.id
       LEFT OUTER JOIN B t4_o_b2 ON t0_o.B2_FK=t4_o_b2.id
       LEFT OUTER JOIN C t5_o_b2_c ON t4_o_b2.C_FK=t5_o_b2_c.id
 ORDER BY t0_o.id DESC FOR UPDATE
With this configuration, you can navigate CMRs from any found instance of A without an additional database load.
31.7.3.1.5. A#findMeParentGrandParent
Here is another example of self-relation. Suppose, we want to write a method that would preload an instance, its parent, grand-parent and its grand-grand-parent in one query. To do this, we would used nested left-join declaration.
<query>
    <query-method>
        <method-name>findMeParentGrandParent</method-name>
        <method-params>
            <method-param>java.lang.Long</method-param>
        </method-params>
    </query-method>
    <jboss-ql><![CDATA[SELECT OBJECT(o) FROM A AS o WHERE o.id = ?1]]></jboss-ql>
    <read-ahead>
        <strategy>on-find</strategy>
        <page-size>4</page-size>
        <eager-load-group>*</eager-load-group>
        <left-join cmr-field="parent" eager-load-group="basic">
            <left-join cmr-field="parent" eager-load-group="basic">
                <left-join cmr-field="parent" eager-load-group="basic"/>
            </left-join>
        </left-join>
    </read-ahead>
</query>
The generated SQL would be:
SELECT t0_o.id, t0_o.name, t0_o.secondName, t0_o.B_FK, t0_o.B2_FK, t0_o.PARENT,
       t1_o_parent.id, t1_o_parent.name,
       t2_o_parent_parent.id, t2_o_parent_parent.name,
       t3_o_parent_parent_parent.id, t3_o_parent_parent_parent.name
  FROM A t0_o
       LEFT OUTER JOIN A t1_o_parent ON t0_o.PARENT=t1_o_parent.id
       LEFT OUTER JOIN A t2_o_parent_parent ON t1_o_parent.PARENT=t2_o_parent_parent.id
       LEFT OUTER JOIN A t3_o_parent_parent_parent 
            ON t2_o_parent_parent.PARENT=t3_o_parent_parent_parent.id
 WHERE (t0_o.id = ?) FOR UPDATE
Note, if we remove left-join metadata we will have only
SELECT t0_o.id, t0_o.name, t0_o.secondName, t0_o.B2_FK, t0_o.PARENT FOR UPDATE