Red Hat Training

A Red Hat training course is available for Red Hat JBoss Data Virtualization

Chapter 7. Query Language Grammars

The hierarchical database supports multiple query languages, including all four languages defined in the JCR 1.0 specification and JCR 2.0 specification .

7.1. JCR-SQL2

The JCR-SQL2 query language is defined by the JCR 2.0 specification as a way to express queries using strings that are similar to SQL. This query language is an improvement over the earlier JCR-SQL language, providing among other things far richer specifications of joins and criteria.

7.1.1. Extensions to JCR-SQL2

The hierarchical database includes full support for the complete JCR-SQL2 query language defined by the specification. However, there are several extensions provided to make it even more powerful:
  • Support for the FULL OUTER JOIN and CROSS JOIN join types, in addition to the LEFT OUTER JOIN, RIGHT OUTER JOIN and INNER JOIN types defined by JCR-SQL2. Note that JOIN is a shorthand for INNER JOIN. For detail, see the grammar for "joins" .
  • Support for the UNION , INTERSECT , and EXCEPT set operations on multiple result sets to form a single result set. As with standard SQL, the result sets being combined must have the same columns. The UNION operator combines the rows from two result sets, the INTERSECT operator returns the difference between two result sets, and the EXCEPT operator returns the rows that are common to two result sets. Duplicate rows are removed unless the operator is followed by the ALL keyword. For detail, see the grammar for "set queries" .
  • Removal of duplicate rows in the results, using SELECT DISTINCT ... expression. For detail, see the grammar for queries .
  • Limiting the number of rows in the result set with the LIMIT count clause, where count is the maximum number of rows that should be returned. This clause may optionally be followed by the OFFSET number clause to specify the number of initial rows that should be skipped. For detail, see the grammar for "limits and offsets" .
  • Additional dynamic operands DEPTH(selectorName) and PATH(selectorName) that enable placing constraints on the node depth and path, respectively. These dynamic operands can be used in a manner similar to NAME(selectorName) and LOCALNAME(selectorName) that are defined by JCR-SQL2. Note in each of these cases, the selectorName is optional if there is only one selector in the query. For detail, see the grammar for "dynamic operands" .
  • Additional dynamic operand REFERENCE(selectorName.propertyName) and REFERENCE(selectorName) that enables placing constraints on one or any of the reference properties, respectively, and which can be used in a manner similar to the standard dynamic operand PropertyValue(selectorName.propertyName). Note in each of these cases, the selectorName is optional if there is only one selector in the query, and that the propertyName can be excluded if the constraint should apply to all reference properties. For detail, see the grammar for "dynamic operands" .
  • Support for the IN and NOT IN clauses to more easily and concisely supply multiple of discrete static operands. For example, WHERE ... [my:type].[prop1] IN (3,5,7,10,11,50) .... For detail, see the grammar for "set constraints" .
  • Support for the BETWEEN clause to more easily and concisely supply a range of discrete operands. For example, WHERE ... [my:type].[prop1] BETWEEN 3 EXCLUSIVE AND 10 .... For detail, see the grammar for "between constraints" .
  • Support for simple arithmetic in numeric-based criteria and order-by clauses. For example, ... WHERE SCORE(type1) + SCORE(type2) > 1.0 or ... ORDER BY (SCORE(type1) * SCORE(type2)) ASC, LENGTH(type2.property1) DESC. For detail, see the grammar for "order-by clauses" .
  • Support for (non-correlated) subqueries in the WHERE clause, wherever a static operand can be used. Subqueries can even be used within another subquery. All subqueries must return a single column, and each row's single value will be treated as a literal value. If the subquery is used in a clause that expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the subquery is used in a clause that allows multiple values (e.g., IN (...)), then all of the subquery's rows will be used. For example, this expression WHERE ... [my:type].[prop1] IN ( SELECT [my:prop2] FROM [my:type2] WHERE [my:prop3] < '1000' ) AND ... will use the results of the subquery as the literal values in the IN clause. See the "subqueries" section for more information.
  • Support for several pseudo-columns (jcr:path, jcr:score, jcr:name, mode:localName, and mode:depth) that can be used in the SELECT , equijoin, and WHERE clauses. These pseudo-columns make it possible to return location-related and score information within the QueryResult 's rows. They also make queries look more like SQL, and thus may be more friendly and easier to use in existing SQL-aware client applications. See the "pseudo-columns" section for more information.
  • Support for NOT LIKE as an operator in comparison criteria, and which is equivalent to wrapping a LIKE comparison criteria in a NOT(...) clause.

7.1.2. Extended JCR-SQL2 Grammar

The full grammar for the hierarchical database's extended JCR-SQL2 support is a strict superset of that defined by the JCR 2.0 specification. In other words, Any JCR-SQL2 query that uses the standard grammar it supported, as well as queries that make use of the provided extensions.

7.1.2.1. Queries

The top-level rule for the extended JCR-SQL2 grammar is QueryCommand , which consists of both Query and SetQuery :
QueryCommand ::= Query | SetQuery

SetQuery ::= Query ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query
                 { ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query }

Query ::= 'SELECT' ['DISTINCT'] columns
          'FROM' Source
          ['WHERE' Constraint]
          ['ORDER BY' orderings]
          [Limit]
The hierarchical database adds the concept of a set query, which is a query that performs a union , intersection , or complement of the results of two other queries. Set queries are common in SQL (which is essentially a set manipulation language) and are a very useful tool that would otherwise require significant processing of the results of multiple queries by the application. By supporting set queries, the application merely needs to declare that set operation be performed, and the hierarchical database will perform all the work before returning the results.
There is also the ability to use SELECT DISTINCT, which eliminates duplicate rows in a manner similar to SQL.

7.1.2.2. Source

A source is a named set of tuples, which in the hierarchical database corresponds to the nodes of a particular named node type. In other words, a source is equivalent to a table in a relational database. The available columns of a source are the named properties declared on the node type.
In the JCR-SQL2 grammar, a source is either a selector (a named node type) or a join specification:
Source ::= Selector | Join

Selector ::= nodeTypeName ['AS' selectorName]

nodeTypeName ::= Name
selectorName ::= /* A string that contains only SQL-legal characters,
                    and which can be used elsewhere in the query to
                    refer to the selector. */

7.1.2.3. Joins

The JCR 2.0 specification does include joins in the standard JCR-SQL2 grammar, though the only defined types of joins included inner , left outer , and right outer joins. Because SQL also defines the useful full outer and cross join types, the hierarchical database adds support for these.
Join ::= left [JoinType] 'JOIN' right 'ON' JoinCondition
         /* If JoinType is omitted INNER is assumed. */

left ::= Source
right ::= Source

JoinType ::= Inner | LeftOuter | RightOuter | FullOuter | Cross

Inner ::= 'INNER' ['JOIN']

LeftOuter ::= 'LEFT JOIN' | 'OUTER JOIN' | 'LEFT OUTER JOIN'

RightOuter ::= 'RIGHT OUTER' ['JOIN']

RightOuter ::= 'FULL OUTER' ['JOIN']

RightOuter ::= 'CROSS' ['JOIN']

JoinCondition ::= EquiJoinCondition |
                  SameNodeJoinCondition |
                  ChildNodeJoinCondition |
                  DescendantNodeJoinCondition
Each of the four kinds of join conditions are described below.
join condition
An equijoin is a join that uses only equality comparisons in the join predicate (or join condition). Using any other operators (e.g., '<' or '!=') in the join condition disqualifies a query from being an equi-join.
Therefore, the rules for the equi-join condition are as follows:
EquiJoinCondition ::= selector1Name'.'property1Name
                      '=' selector2Name'.'property2Name

selector1Name ::= selectorName
selector2Name ::= selectorName
property1Name ::= propertyName
property2Name ::= propertyName

propertyName ::= Name
where the node type referenced by the selector identified in the query with the selector1Name must contain the property given by the property1Name literal, and similarly the node type referenced by the selector identified in the query with the selector2Name must contain the property given by the property2Name literal.
See also the "name rule" .
node join condition
An identity join is a special case of an equijoin, where the compared properties are node identifiers. Thus the join condition of an identity join constrains the node on one sides of the join to be the same node on the other side of the join. The standard JCR-SQL2 grammar defines a special function that makes this a little easier to use:
SameNodeJoinCondition ::= 'ISSAMENODE(' selector1Name
                                    ',' selector2Name
                                   [',' selector2Path] ')'

selector1Name ::= selectorName
selector2Name ::= selectorName
selector2Path ::= Path
See also the "path rule" .
Child-node join condition
A child-node join is one where the join condition constrains the node on the left side of the join to be a child of the node on the right side of the join. The standard JCR-SQL2 grammar defines a special function that makes it easier to specify such join conditions:
ChildNodeJoinCondition ::= 'ISCHILDNODE('
                            childSelectorName ','
                            parentSelectorName ')'

childSelectorName ::= selectorName
parentSelectorName ::= selectorName
Descendant-node join condition
A descendant-node join is one where the join condition constrains the node on the left side of the join to be a descendant of the node on the right side of the join. The standard JCR-SQL2 grammar defines a special function that makes it easier to specify such join conditions:
DescendantNodeJoinCondition ::= 'ISDESCENDANTNODE('
                                 descendantSelectorName ','
                                 ancestorSelectorName ')'

descendantSelectorName ::= selectorName
ancestorSelectorName ::= selectorName

7.1.2.4. Constraints

The "query rule" included a WHERE clause that can define multiple constraints on the nodes included in the results. The standard JCR-SQL2 grammar defined several such constraints, including and , or , not , comparison , property existence , full-text search , same-node , child-node , and descendant-node constraints. The hierarchical database supports all of these, but adds two others: between and set constraints.
Constraint ::= ConstraintItem | '(' ConstraintItem ')'

ConstraintItem ::= And | Or | Not
                   Comparison | Between |
                   PropertyExistence |
                   SetConstraint |
                   FullTextSearch |
                   SameNode |
                   ChildNode |
                   DescendantNode
Each of these types of constraints are described below.
And constraint
An and constraint stipulates that a node (or record or tuple) is included only if two other constraints are both true.
And ::= constraint1 'AND' constraint2

constraint1 ::= Constraint
constraint2 ::= Constraint
Or constraint
An or constraint stipulates that a node (or record or tuple) is included if either of two other constraints are true.
Or ::= constraint1 'OR' constraint2

constraint1 ::= Constraint
constraint2 ::= Constraint
Not constraint
The not qualifier will negate another constraint, requiring that a node (or record or tuple) is included if the other constraint is not true.
Not ::= 'NOT' constraint

constraint ::= Constraint
Comparison constraint
A comparison constraint requires that the value for a node described by the dynamic operand on the left side of the operator is to be compared to a static literal value. The term "dynamic operand" is used in the JCR-SQL2 grammar because its value can only be determined during query evaluation.
Comparison ::= DynamicOperand Operator StaticOperand

Operator ::= '=' | '!=' | '<' | '<=' | '>' | '>=' | 'LIKE' | 'NOT LIKE'
The behavior of the operators is dictated by the JCR 2.0 specification and matches how Value objects are compared:
  • If the DynamicOperand evaluates to null, the constraint is not satisfied.
  • If the '=' operator is used, the value that the DynamicOperand evaluates to must equal the StaticOperand value for the constraint to be satisfied.
  • If the '!=' operator is used, the value that the DynamicOperand evaluates to must not equal the StaticOperand value for the constraint to be satisfied.
  • If the '<' operator is used, the value that the DynamicOperand evaluates to must be less than the StaticOperand value for the constraint to be satisfied.
  • If the '<=' operator is used, the value that the DynamicOperand evaluates to must be less than or equal to the StaticOperand value for the constraint to be satisfied.
  • If the '>' operator is used, the value that the DynamicOperand evaluates to must be greater than the StaticOperand value for the constraint to be satisfied.
  • If the '>=' operator is used, the value that the DynamicOperand evaluates to must be greater than or equal to the StaticOperand value for the constraint to be satisfied.
  • If the 'LIKE' operator is used, the constraint is only satisfied if the value that the DynamicOperand evaluates to match the pattern specified by the string literal StaticOperand , where in the pattern:
    • the character '%' matches zero or more characters, and
    • the character '_' (underscore) matches exactly one character, and
    • the string '\x' matches the character 'x', and
    • all other characters match themselves
  • If the NOT LIKE operator is used, the constraint is only satisfied if the value that the DynamicOperand evaluates to not match the pattern specified by the string literal StaticOperand , where in the pattern:
    • the character '%' matches zero or more characters, and
    • the character '_' (underscore) matches exactly one character, and
    • the string '\x' matches the character 'x', and
    • all other characters match themselves
Also, note that, unlike SQL, the standard JCR-SQL2 grammar does not allow the left-hand side and right-hand sides of a comparison constraint to be swapped.
Between constraint
The between constraint is one of the extensions defined by the hierarchical database, and allows a query to more easily represent a range of static values than using only the constraints available in the standard JCR-SQL2 grammar. The between constraint is based on the similar expression in SQL.
Between ::= DynamicOperand ['NOT'] 'BETWEEN'
                lowerBound ['EXCLUSIVE'] 'AND'
                upperBound ['EXCLUSIVE']

lowerBound ::= StaticOperand
upperBound ::= StaticOperand
Property existence constraint
A property existence constraint stipulates that a property does indeed exist on a node that is of the node type specified by the named selector. the hierarchical database does allow the NOT qualifier to be excluded, which turns the constraint into a stipulation that the property does not exist on the node.
PropertyExistence ::= [selectorName'.']propertyName 'IS' ['NOT'] 'NULL'

                      /* If only one selector exists in this query,
                         explicit specification of the selectorName
                         preceding the propertyName is optional */
Set constraint
Like the "between constraint", the set constraint is an extension to the standard JCR-SQL2 grammar that allows what would normally be a complicated combination of standard JCR-SQL2 constraints to be more easily represented with a single, simple expression. Again, this constraint is patterned after the similar expression in SQL.
SetConstraint ::= [selectorName '.']propertyName ['NOT'] 'IN'
                    '(' firstStaticOperand
                        {',' additionalStaticOperand }
                    ')'

                   /* If only one selector exists in this query,
                      explicit specification of the selectorName
                      preceding the propertyName is optional */

firstStaticOperand ::= StaticOperand
additionalStaticOperand ::= StaticOperand
Note that multiple static operands can be included in the comma-separated list.
Although this rule seems complicated, it is actually very straightforward. The following query selects all the properties defined on the acme:taggable node type, returning only those "taggable" nodes with a acme:tagname value of "tag1", "tag2", "tag3", or "tag4":
SELECT * FROM [acme:taggable] as tagged
WHERE tagged.[acme:tagName] IN ('tag1','tag2','tag3','tag4')
Even this trivial query is quite a bit simpler and easier to understand than if the query had used only the constraints defined by the standard JCR-SQL2 grammar:
SELECT * FROM [acme:taggable] as tagged
WHERE tagged.[acme:tagName] = 'tag1'
   OR tagged.[acme:tagName] = 'tag2'
   OR tagged.[acme:tagName] = 'tag3'
   OR tagged.[acme:tagName] = 'tag4'
Imagine how complicated a query might be with multiple joins, multiple criteria, and many values to be compared for one or several different properties.
text search constraint
FullTextSearch ::= 'CONTAINS('
                      ([selectorName'.']propertyName | selectorName'.*')
                      ',' ''' fullTextSearchExpression''' ')'

                   /* If only one selector exists in this query,
                      explicit specification of the selectorName
                      preceding the propertyName is optional */

fullTextSearchExpression ::= FulltextSearch
The full-text search expression is a string literal that adheres to the "full-text search" grammar described below.
An example query selects all the properties defined on the acme:taggable node type, returning only those "taggable" nodes with a acme:tagname value that contains the "foo" term within the value:
SELECT * FROM [acme:taggable] as tagged
WHERE CONTAINS(tagged.[acme:tagName],'foo')
node constraint
The same-node constraint stipulates that the node appearing in the selector with the given name has a path that matches the literal path provided.
SameNode ::= 'ISSAMENODE(' [selectorName ','] Path ')'

              /* If only one selector exists in this query,
                 explicit specification of the selectorName
                 preceding the propertyName is optional */
Because this standard constraint clause is not really like traditional SQL, the hierarchical database defines a jcr:path "pseudo-column" that can be used in "comparison constraints" and that allows for using other comparison operators, including LIKE.
Child-node constraint
The child-node constraint stipulates that the node appearing in the selector with the given name is a child of a node with a path that matches the literal path provided.
ChildNode ::= 'ISCHILDNODE(' [selectorName ','] Path ')'

              /* If only one selector exists in this query,
                 explicit specification of the selectorName
                 preceding the propertyName is optional */
See also the jcr:path "pseudo-column" that can be used in "comparison constraints" and that allows for using other comparison operators, including LIKE. And because the right hand side (i.e., static operand) of a LIKE expression can involve wildcards, it may be easier and more understandable to use the pseudo-column.
Descendant-node constraint
The descendant-node constraint stipulates that the node appearing in the selector with the given name is a descendant of a node with a path that matches the literal path provided.
DescendantNode ::= 'ISDESCENDANTNODE(' [selectorName ','] Path ')'

              /* If only one selector exists in this query,
                 explicit specification of the selectorName
                 preceding the propertyName is optional */
See also the jcr:path "pseudo-column" that can be used in "comparison constraints" and that allows for using other comparison operators, including LIKE. And because the right hand side (i.e., static operand) of a LIKE expression can involve wildcards, it may be easier and more understandable to use the pseudo-column.

7.1.2.5. Path and Name

Many of the rules above have used paths and names, and the rules for these are defined as follows:
Name ::= '[' quotedName ']' | '[' simpleName ']' | simpleName

quotedName ::= /* A JCR Name (see the JCR specification) */
simpleName ::= /* A JCR Name that contains only SQL-legal
                  characters (namely letters, digits, and underscore) */

Path ::= '[' quotedPath ']' | '[' simplePath ']' | simplePath

quotedPath ::= /* A JCR Path that contains non-SQL-legal characters */
simplePath ::= /* A JCR Path (rather Name) that contains only SQL-legal
                    characters (namely letters, digits, and underscore) */
Note that JCR-SQL2 surrounds identifiers with square brackets (e.g., '[' and ']'), allowing names to contain a ':' character needed with namespaced names. If the names or paths only contain valid SQL characters, then they do not need to be quoted.

7.1.2.6. Static Operand

In the standard JCR-SQL2 grammar, a static operand appears on the right-hand side of an operator, and represents an expression whose value can be determined by static analysis of the query (e.g., when the query is parsed ). In particular, a static operand in the standard JCR-SQL2 grammar comprised of either a literal value or a variable.
In SQL, however, the expression that appears on the right-hand side of an operator is not always able to be determined at query parse time. An example is a subquery, which appears on the right hand side but obviously can only be evaluated into values during query execution time. Since standard JCR-SQL2 does not include any such features, the term "static operand" is technically valid.
In addition to literal values and variables, the hierarchical database also supports "subqueries" appearing on the right-hand side of an operator. So this grammar continues to use the "static operand" term for easy comparison with the standard JCR-SQL2 grammar, but the term has a different (and expanded) semantic than in the standard grammar.
Therefore, the rules for what the hierarchical database allows on the right-hand side of an operator in a constraint is as follows:
StaticOperand ::= Literal | BindVariableValue | Subquery

Literal ::= CastLiteral | UncastLiteral

CastLiteral ::= 'CAST(' UncastLiteral ' AS ' PropertyType ')'

PropertyType ::= 'STRING' | 'BINARY' | 'DATE' |
                 'LONG' | 'DOUBLE' | 'DECIMAL' |
                 'BOOLEAN' | 'NAME' | 'PATH' |
                 'REFERENCE' | 'WEAKREFERENCE' |
                 'URI'

UncastLiteral ::= UnquotedLiteral |
                  ''' UnquotedLiteral ''' |
                  '"' UnquotedLiteral '"'

UnquotedLiteral ::= /* String form of a JCR Value,
                       as defined in the JCR specification */
Bind variable
The standard JCR-SQL2 grammar supports using variable names within a query, where the values for those variables are bound to the Query object before execution. In the query, the variable names are prefixed with a '$' character and are otherwise normal JCR name:
BindVariableValue ::= '$'bindVariableName

bindVariableName ::= /* A string that conforms to the JCR Name syntax,
                        though the prefix does not need to be a
                        registered namespace prefix. */
So, consider this simple query that selects all the properties defined on the acme:taggable node type, and that returns only those "taggable" nodes with a acme:tagname value that matches the value of the tagValue variable:
SELECT * FROM [acme:taggable] as tagged
WHERE tagged.[acme:tagName] = $tagValue
This query could be evaluated using the JCR API as follows:
// Obtain the query manager for the session via the workspace ...
javax.jcr.Session session = // ...
javax.jcr.query.QueryManager mgr = session.getWorkspace().getQueryManager();

// Create a query object ...
String language = ...
String expression = ...
javax.jcr.query.Query query = queryManager.createQuery(expression,language);

// Bind a value to the variable ...
Value tag = session.getValueFactory().create("foo");
query.bindVariable("tagValue",tag);

// Execute the query and get the results ...
javax.jcr.query.QueryResult result = query.execute();
Obviously multiple variables can be used in a query expression, but a value must be bound to every variable before the Query object can be executed.
Subquery
The standard JCR-SQL2 grammar does not support subqueries. But subqueries are such a useful feature, so the hierarchical database supports using multiple subqueries within a single query. In fact, subqueries are nothing more than a QueryCommand , which if you'll remember is the top-level rule in the grammar. That means that subqueries can be any query, and you can even include subqueries within a subquery!
Subquery ::= '(' QueryCommand  ')' |
             QueryCommand
Strictly speaking, the hierarchical database only supports non-correlated subqueries, which means that they can actually be evaluated independently (outside the context of the containing query).
Additionally, because subqueries appear on the right-hand side of an operator, all subqueries must return a single column, and each row's single value will be treated as a literal value. If the subquery is used in a clause that expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the subquery is used in a clause that allows multiple values (e.g., IN (...)), then all of the subquery's rows will be used.
For example, in the following query fragment, the first value in each row of the subquery's results will be used within the IN clause of the outer query:
WHERE ... [my:type].[prop1] IN
   ( SELECT [my:prop2] FROM [my:type2]
     WHERE  [my:prop3] < '1000' )
AND ...
However, changing the IN clause to a comparison results in only the first value in the first row of the subquery's results being using in the comparison criteria:
WHERE ... [my:type].[prop1] =
   ( SELECT [my:prop2] FROM [my:type2]
     WHERE  [my:prop3] < '1000' )
AND ...

7.1.2.7. Dynamic Operand

In various constraints described above, the dynamic operand appears on the left-hand side of an operator, and signifies that the values can only be determined when the query is evaluated.
The standard JCR-SQL2 grammar defines seven kinds of dynamic operands: property value , length , node name , node local name , full-text search score , lowercase , and uppercase .
The hierarchical database supports all these types, but adds support for four more: reference value , node path , node depth , and simple arithmetic clauses . The hierarchical database also allows the dynamic operand to be surrounded by parentheses, which is sometimes convenient for complex queries.
The DynamicOperand rule in the extended grammar is:
DynamicOperand ::= PropertyValue | ReferenceValue |
                   Length | NodeName | NodeLocalName |
                   NodePath | NodeDepth |
                   FullTextSearchScore |
                   LowerCase | UpperCase |
                   Arithmetic |
                   '(' DynamicOperand ')'
Each of these types of dynamic operands is described below.
Property value operand
The property value operand always evaluates to the value(s) of the specified property on the selector.
PropertyValue ::= [selectorName'.'] propertyName

                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   preceding the propertyName is optional. */
Note that if the property is multi-valued, the constraint will be satisfied if any of the property values works with the constraint. For example, if the acme:tagNames property is a multi-valued property declared on the acme:taggable node type, then the following query will finds all acme:taggable nodes that has "foo" for at least one of the values of the acme:tagNames property:
SELECT * FROM [acme:taggable] as tagged
WHERE tagged.[acme:tagNames] = 'foo'
Reference value operand
One of the extensions is to support the a REFERENCE(...) dynamic operand, which enables placing constraints on one or any of the reference properties.
ReferenceValue ::= 'REFERENCE(' selectorName '.' propertyName ')' |
                   'REFERENCE(' selectorName ')' |
                   'REFERENCE()' |

                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   preceding the propertyName is optional.
                   Also, the property name may be excluded
                   if the constraint should apply to any
                   reference property.*/
The REFERENCE operand always evaluates to the identifier of the referenced nodes in one or all of the REFERENCE properties. Thus, all of the REFERENCE operands should be used with a StaticOperand that also evaluates to identifiers.
The REFERENCE() operand (with no selector name and no property name) evaluates to the identifiers of the nodes referenced by all of reference properties on the node in the only selector. The REFERENCE(selectorName) works the same way, but must be used if there is more than one selector in the query. Finally, the REFERENCE(selectorName.propertyName) evaluates to the identifiers of nodes referenced by the propertyName reference property on the nodes in the named selector.
For example, here is a query that finds all nodes that reference a set of nodes for which we already know the identifiers, id1, id2, and id3.
SELECT * FROM [nt:base]
WHERE REFERENCE() IN ('id1','id2','id3')
This operand works really well with subqueries or variables for the right-hand side. For example, here is a query finds all nodes that reference any of the nodes in the subgraph below the /foo/bar/baz node, where a subquery is used to find all nodes in the subgraph:
SELECT * FROM [nt:base]
WHERE REFERENCE() IN (
  SELECT [jcr:uuid] FROM [nt:base] AS refd
  WHERE ISDESCENDANT(refd,'/foo/bar/baz')
  )
This kind of query is impossible to do using standard JCR-SQL2 features, and shows some of the power of the extensions to JCR-SQL2.
Length operand
The length operand evaluates to the length (or lengths, if multi-valued) of a property. The length is defined to be:
  • for a BINARY value, the number of bytes in the value, or
  • for all other value types, the number of characters of the string resulting from a conversion of the value to a string.
The rule for the length operand is:
Length ::= 'LENGTH(' PropertyValue ')'
Node name operand
The node name operand always evaluates to the prefixed name of the node given by the supplied selector:
NodeName ::= 'NAME(' [selectorName] ')'

                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   is optional */
See also the jcr:name "pseudo-column", which enables accessing the JCR name of any node as if the name were a regular property on any node.
Node local name operand
The node name operand always evaluates to the local name of the node given by the supplied selector:
NodeLocalName ::= 'LOCALNAME(' [selectorName] ')'

                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   is optional */
See also the mode:localName "pseudo-column", which enables accessing the local name of any node as if the local name were a regular property.
Node depth operand
The node depth operand is an extension to the standard set of dynamic operands specific to the hierarchical database, and evaluates to the integer depth of the node given by the supplied selector. The depth of a node is defined to be the number of segments in the node's path. For example, the depth of the root node is 0, whereas the depth of the node at /foo/bar/baz is 3.
NodeDepth ::= 'DEPTH(' [selectorName] ')'

                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   is optional */
See also the mode:depth "pseudo-column", which enables accessing the depth of any node as if the depth were a regular property.
Node path operand
The node path operand is an extension to the standard set of dynamic operands specific to the hierarchical database, and evaluates to the path of the node given by the supplied selector.
NodePath ::= 'PATH(' [selectorName] ')'

                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   is optional */
See also the jcr:path "pseudo-column", which enables accessing the path of any node as if the path were a regular property.
Full text search score operand
The full-text search score operand evaluates to a DOUBLE value equal to the full-text search score of a node. The full-text search score ranks a selector's nodes by their relevance to the fullTextSearchExpression specified in a [FullTextSearch|#Fulltextsearchconstraint . The magnitude of the scores are implementation specific, but most implementations will produce higher scores with more relevant matching and lower scores for less-relevant matching.
FullTextSearchScore ::= 'SCORE(' [selectorName] ')'
                /* If only one selector exists in this query,
                   explicit specification of the selectorName
                   is optional */

See also the jcr:score "pseudo-column", which enables accessing the score of any node as if the score were a regular property.
Lowercase operand
The lowercase operand evaluates to the lower-case string value (or values, if multi-valued) of operand. If the operand does not evaluate to a string value, its value is first converted to a string.
LowerCase ::= 'LOWER(' DynamicOperand ')'
Uppercase operand
The uppercase operand evaluates to the upper-case string value (or values, if multi-valued) of operand. If the operand does not evaluate to a string value, its value is first converted to a string.
LowerCase ::= 'LOWER(' DynamicOperand ')'
Arithmetic operand
The arithmetic operand is an extension to the standard JCR-SQL2 grammar specific to the hierarchical database. It allows two other dynamic operands that evaluate to numeric values to be numerically combined using addition, subtraction, multiplication, or division.
Arithmetic ::= DynamicOperand ('+'|'-'|'*'|'/') DynamicOperand
For example, the following query restricts the results such that the sum of the score of nodes originating from separate selectors is greater than 1.0:
SELECT * FROM [acme:type1] AS type1
         JOIN [acme:type2] as type2 ON type1.prop1 < type2.prop2
         WHERE SCORE(type1) + SCORE(type2) > 1.0
So although it is possible to use in the WHERE clause, it is more likely to be used in the order-by clauses . For example, the following query orders the results based upon the difference in the scores of nodes in the two selectors:
SELECT * FROM [acme:type1] AS type1
         JOIN [acme:type2] as type2 ON type1.prop1 < type2.prop2
         ORDER BY ( SCORE(type1) - SCORE(type2) ) ASC,
                  LENGTH(type2.prop3) DESC

7.1.2.8. Ordering

The ORDER BY clause defined by the standard JCR-SQL2 grammar allows the order of the results to be dictated by the values evaluated at execution time based upon one or more "dynamic operands" . The rule for the expression is as follows:
orderings ::= Ordering {',' Ordering}

Ordering ::= DynamicOperand [Order]

Order ::= 'ASC' | 'DESC'
As with SQL, the ASC qualifier specifies that the ordering should be in ascending order, and is the default; likewise, the DESC qualifier specifies that the ordering should be in descending order.

7.1.2.9. Columns

The standard JCR-SQL2 grammar allows a query to include in the SELECT clause which property values should be returned and included in the results:
columns ::= (Column ',' {Column}) | '*'

Column ::= ([selectorName'.']propertyName ['AS' columnName]) |
           (selectorName'.*')

        /* If only one selector exists in this query,
           explicit specification of the selectorName
           is optional */

selectorName ::= Name
propertyName ::= Name
columnName ::= Name
When '*' "' is used for the list of selected columns, the result set is expected to minimally include, for each selector, a column for each single-valued non-residual property of the selector's node type, including those explicitly declared on the node type and those inherited from the node's supertypes.
For example, the result set for the following query would contain at least the [jcr:primaryType] column, since it is the only single-valued, non-residual property defined on the [nt:base] node type. The [jcr:mixinTypes] property is also non-residual, but the results need not include it since it is multi-valued.
SELECT * FROM [nt:base]
If there are multiple selectors, then SELECT * will include all of the selectable columns from each selector's node type. However, it is possible to request all of the selectable columns from some of the selectors, using the form. For example:
SELECT type1.*
FROM [acme:type1] AS type1
JOIN [acme:type2] as type2 ON type1.prop1 < type2.prop2
Note, however, that although only single-valued, non-residual properties are included when '*' "' is used in the SELECT clause, it is possible to explicitly include residual properties. For example, the following query finds all nodes that have at least one "foo" value for the acme:tagNames property:
SELECT [acme:tagNames] AS tagName
FROM [nt:base] WHERE tagName = 'foo'

7.1.2.10. Limit and Offset

Neither the standard JCR-SQL2 grammar or the JCR API itself provide support for limiting the rows that are returned in the results. This is a common need, especially for applications that paginate the results, where each page shows a subset of the results.
Because this is such an essential feature that cannot be accomplished any other way, the hierarchical database adds support for specifying the maximum number of rows to return, and optionally specifying the number of initial rows that should be skipped. This extension follows the SQL syntax:
Limit ::= 'LIMIT' count [ 'OFFSET' offset ]
count ::= /* Positive integer value */
offset ::= /* Non-negative integer value */
The LIMIT clause is entirely optional, and if absent does not limit the result set rows in any way. However, if the LIMIT count clause is used, then the result set will contain no more than count rows. This LIMIT clause may optionally be followed by the OFFSET number clause, where number is the number of initial rows that should be skipped before the rows are included in the results.

7.1.2.11. Psuedo-Columns

The design of the JCR-SQL2 query language makes fairly heavy use of functions, including SCORE() , NAME() , LOCALNAME() , and various constraints. The hierarchical database provides several more useful functions, such as PATH() and DEPTH() , that follow the same patterns.
However, these functions have several disadvantages. First, they make the JCR-SQL2 language less "SQL-like", since SQL-92 and -99 do not define similar kinds of functions. (There are aggregate functions, like COUNT , SUM , etc., but they operate on a particular column in all tuples and are therefore more dissimilar than similar.) This means that applications that use SQL and SQL-like query languages are less likely to be able to build and issue JCR-SQL2 queries.
A second disadvantage of these functions is that JCR-SQL2 does not allow them to be used within the SELECT clause. As a result, the location-related and score information cannot be included as columns of values in the QueryResult rows. Instead, a client can only access this information by obtaining the Node object(s) for each row. Relying upon both the result set and additional Java objects makes it difficult to use the JCR query system. It also makes certain kinds of applications impossible.
For example, the hierarchical database's JDBC driver is designed to enable JDBC-aware applications to query repository content using JCR-SQL2 queries. The standard JDBC API cannot expose the Node objects, so the only way to return the path-related and score information is through additional columns in the result. While such columns could always "magically" appear in the result set, doing this is not compatible with JDBC applications that dynamically build the SELECT clauses of queries based upon database metadata. Such applications require the columns to be properly described in database metadata, and the columns need to be used within queries.
The hierarchical database attempts to solve these issues by directly supporting a number of "pseudo-columns" within JCR-SQL2 queries, wherever columns can be used. These "pseudo-columns" include:
  • jcr:score is a column of type DOUBLE that represents the full-text search score of the node, which is a measure of the node's relevance to the full-text search expression. The hierarchical database does compute the scores for all queries, though the score for rows in queries that do not include a full-text search criteria may not be reliable.
  • jcr:path is a column of type PATH that represents the normalized path of a node, including same-name siblings. This is the same as what would be returned by the getPath() method of Node. Examples of paths include "/jcr:system" and "/foo/bar3".
  • jcr:name is a column of type NAME that represents the node name in its namespace-qualified form using namespace prefixes and excluding same-name-sibling indexes. Examples of node names include "jcr:system", "jcr:content", "ex:UserData", and "bar".
  • mode:localName is a column of type STRING that represents the local name of the node, which excludes the namespace prefix and same-name-sibling index. As an example, the local name of the "jcr:system" node is "system", while the local name of the "ex:UserData3" node is "UserData".
  • mode:depth is a column of type LONG that represents the depth of a node, which corresponds exactly to the number of path segments within the path. For example, the depth of the root node is 0, whereas the depth of the "/jcr:system/jcr:nodeTypes" node is 2.
All of these pseudo-columns can be used in the SELECT clause of any JCR-SQL2 query, and their use defines whether such columns appear in the result set. In fact, all of these pseudo-columns will be included when SELECT * clauses in JCR-SQL2 queries are expanded by the query engine. This means that every node type (even mixin node types that have no properties and are essentially markers) are represented by a queryable table with at least one column. However, unlike the older JCR-SQL query language, these pseudo-columns are never included in the result unless explicitly included or implicitly included with the SELECT * clause.

Note

Why did the hierarchical database use the jcr namespace prefix for some of the pseudo-columns, and mode for the others? The older JCR-SQL language defined the jcr:score, jcr:path, and jcr:name pseudo-columns, so we use the same names. The other columns were unique to the hierarchical database and are therefore defined with the mode namespace prefix.
Like any other column, all of these pseudo-columns can be also be used in the WHERE clause of any JCR-SQL2 query, even if they are not included in the SELECT clause. They can be used anywhere that a regular column can be used, including within constraints and dynamic operands. The hierarchical database will automatically rewrite queries that use pseudo-columns in the dynamic operands of constraints to use the corresponding function, such as SCORE() , PATH() , NAME() , LOCALNAME() , and DEPTH() . Additionally, any property existence constraint using these pseudo-columns will always evaluate to 'true' (and thus the hierarchical database's query optimizer will always remove such constraints from the query plan).
The jcr:path pseudo-column may also be used on both sides of an "equijoin" constraint clause. For example, equijoin expressions similar to:
 ... selector1.[jcr:path] = selector2.[jcr:path] ...
will be automatically rewritten by the hierarchical database's optimizer to the following form:
 ... ISSAMENODE(selector1,selector2) ...
As with regular columns, the pseudo-columns must be qualified with the selector name if the query contains more than one selector.

7.1.3. Full-text Search Grammar

The grammar for the full-text search expressions used in the JCR-SQL2's "full-text search constraint" is as follows:
FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}

Disjunct ::= Term {Space Term}

Term ::= ['-'] SimpleTerm

SimpleTerm ::= Word | '"' Word {Space Word} '"'

Word ::= NonSpaceChar {NonSpaceChar}

Space ::= SpaceChar {SpaceChar}

NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */

SpaceChar ::= ' '

Char ::= /* Any character */
This grammar supports expressions similar to what you might provide to an Internet search engine. It lists the terms or phrases that should appear (or not appear) in the applicable property value(s). Simple terms consist of a single word (with only non-space characters), while phrases can be surrounded with double quotes.

7.1.4. Example JCR-SQL2 Queries

7.1.4.1. Simple Queries

One of the simplest JCR-SQL2 queries finds all nodes in the current workspace of the repository:
SELECT * FROM [nt:base]
This query will return a result set containing the jcr:primaryType column, since the nt:base node type defines only one single-valued, non-residual property called jcr:primaryType.

Note

The hierarchical database does not currently support returning multi-valued properties in result sets. This is permitted by the JCR 2.0 specification. The hierarchical database does, however, support using multi-valued properties in constraints and ORDER BY clauses.
Since our query used SELECT *, the hierarchical database also includes the five non-standard pseudo-columns mentioned above: jcr:path, jcr:score, jcr:name, mode:localName, and mode:depth. These columns are very convenient to have in the results, but also make certain criteria much easier than with the corresponding standard functions or those specific to the hierarchical database.
Queries can explicitly specify the columns that are to be returned in the results. The following query is very similar to the previous query and will return the same rows, but the result set will have only a single column and will not include any of the pseudo-columns:
SELECT [jcr:primaryType] FROM [nt:base]
The following query will return the same rows as in the previous two queries, but the SELECT clause explicitly includes only two of the pseudo-columns for the path and depth (which are computed from the nodes' locations):
SELECT [jcr:primaryType], [jcr:path], [mode:depth] FROM [nt:base]
In JCR-SQL2, a table representing a particular node type will have a column for each of the node type's property definitions, including those inherited from supertypes. For example, the nt:file node type, its nt:hierarchyNode supertype, and the mix:created mixin type are defined using the CND notation as follows:
[mix:created] mixin
  - jcr:created (date) protected
  - jcr:createdBy (string) protected

[nt:hierarchyNode] > mix:created abstract

[nt:file] > nt:hierarchyNode
  + jcr:content (nt:base) primary mandatory
Therefore, the table representing the nt:file node type will have 3 columns: the jcr:created and jcr:createdBy columns inherited from the mix:created mixin node type (via the nt:hierarchyNode node type), and the jcr:primaryType column inherited from the nt:base node type, which is the implicit supertype of the nt:hierarchyNode (and all node types).
The hierarchical database adheres to this behavior with the exception that a SELECT * will result in the additional pseudo-columns. Thus, this next query:
SELECT * FROM [nt:file]
is equivalent to this query:
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy],
       [jcr:path], [jcr:name], [jcr:score], [mode:localName], [mode:depth]
FROM [nt:file]

7.1.4.2. Using Columns in Constraints

Consider a query that selects some of the available columns from the nt:file table and uses a constraint to ensure the resulting file nodes have names that end in '.txt':
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path] FROM [nt:file]
WHERE LOCALNAME() LIKE '%.txt'
The hierarchical database also supports placing criteria against the mode:localName pseudo-column instead of using the LOCALNAME() function. Such a query is equivalent to the previous query and will produce the exact same results:
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path]
FROM [nt:file]
WHERE [mode:localName] LIKE '%.txt'

Note

The hierarchical database's pseudo-columns are often far easier to use than the corresponding function-like constraints.
Although this query looks much more like SQL, the use of the '[' and ']' characters to quote the identifiers is not typical of a SQL dialect. The hierarchical database actually supports the using double-quote characters and square braces interchangeably around identifiers (although they must match around any single identifier). Again, this next query, which looks remarkably like any SQL-92 or -99 dialect, is functionally identical to the previous two queries:
SELECT "jcr:primaryType", "jcr:created", "jcr:createdBy", "jcr:path" FROM "nt:file"
WHERE "mode:localName" LIKE '%.txt'

7.1.4.3. Inner Joins

In JCR-SQL2, a node will appear as a row in each table that corresponds to the node types defined by that node's primary type or mixin types, or any supertypes of these node types. In other words, a node will appear in the table corresponding to each node type for which Node.isNodeType(...) returns true.
For example, consider a node that has a primary type of nt:file but has an explicit mixin of mix:referenceable . This node will appear as a row in the all of these tables:
  • nt:file
  • mix:referenceable
  • nt:hierarchyNode
  • mix:created
  • nt:base
However, the columns in each of these tables will differ. The nt:file node type has the nt:hierarchyNode , mix:created , and nt:base for supertypes, and therefore the table for nt:file contains columns for the property definitions on all of these types. But because mix:referenceable is not a supertype of nt:file , the table for nt:file will not contain a jcr:uuid column. To obtain a single result set that contains columns for all the properties of our node, we need to perform an identity join .
The next query shows how to return all properties for nt:file nodes that are also mix:referenceable :
SELECT file.*, ref.*
FROM [nt:file] AS file
JOIN [mix:referenceable] AS ref
  ON ISSAMENODE(file,ref)
Since wildcards were used in the SELECT clause, the hierarchical database expands the SELECT clause to include the columns for all (explicit and inherited) property definitions of each type plus pseudo-columns for each type, which is equivalent to:
SELECT file.[jcr:primaryType],
       file.[jcr:created],
       file.[jcr:createdBy],
       file.[jcr:path],
       file.[jcr:name],
       file.[jcr:score],
       file.[mode:localName],
       file.[mode:depth],
       ref.[jcr:path],
       ref.[jcr:name],
       ref.[jcr:score],
       ref.[mode:localName],
       ref.[mode:depth],
       ref.[jcr:uuid]
FROM [nt:file] AS file
JOIN [mix:referenceable] AS ref
  ON ISSAMENODE(file,ref)
Note because we are using an identity join, the file.[jcr:path] column will contain the same value as the ref.[jcr:path].

Note

Fully-expand the SELECT clause to specify exactly the columns that you want, excluding the columns that return the same values or return values not needed by your application. This can also make the query a bit more efficient, since less data needs to be found and returned.
By the way, this is also what many well-written applications do when querying SQL databases.
Here is a query that does this by eliminating columns with duplicate values and using aliases that are simpler than the namespace-qualified names:
SELECT file.[jcr:primaryType] AS primaryType,
       file.[jcr:created] AS created,
       file.[jcr:createdBy] AS createdBy,
       ref.[jcr:uuid] AS uuid,
       file.[jcr:path] AS path,
       file.[jcr:name] AS name,
       file.[jcr:score] AS score,
       file.[mode:localName] AS localName,
       file.[mode:depth] AS depth
FROM [nt:file] AS file
JOIN [mix:referenceable] AS ref
  ON ISSAMENODE(file,ref)
Although this query looks much more like SQL, use of the '[' and ']' characters in JCR-SQL2 to quote the identifiers is not typical of a SQL dialect. Again, the hierarchical database supports the using double-quote characters and square braces interchangeably around identifiers (although they must match around any single identifier). This makes it easier for existing SQL-oriented tools and applications to work more readily, including applications that use the hierarchical database's JDBC driver to query a JCR repository.
This next query, which looks remarkably like any SQL-92 or -99 dialect, is functionally identical to the previous query. However, it uses double quotes and a pseudo-column identity constraint on jcr:path (which is identical in semantics and performance as the ISSAMENODE(...) constraint):
SELECT file."jcr:primaryType" AS primaryType,
       file."jcr:created" AS created,
       file."jcr:createdBy" AS createdBy,
       ref."jcr:uuid" AS uuid,
       file."jcr:path" AS path,
       file."jcr:name" AS name,
       file."jcr:score" AS score,
       file."mode:localName" AS localName,
       file."mode:depth" AS depth
FROM "nt:file" AS file
JOIN "mix:referenceable" AS ref
  ON file."jcr:path" = ref."jcr:path"

Note

When using joins and selecting multiple columns, use aliases on the columns to make it easier to reference those columns in constraints and ordering clauses.

7.1.4.4. Other Joins

These are examples of two-way inner joins, but the hierarchical database supports joining multiple tables together in a single query. The hierarchical database also supports a variety of joins, including:
  • INNER JOIN (or JOIN )
  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN
  • CROSS JOIN

7.1.4.5. Set Operations

The hierarchical database also supports several other query features beyond JCR-SQL2. One of these is support for set queries that use:
  • UNION and UNION ALL
  • INTERSECT and INTERSECT ALL
  • EXCEPT and EXCEPT ALL .
Here is an example of a union:
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path] FROM [nt:file]
UNION
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path] FROM [nt:folder]

7.1.4.6. Subqueries

The hierarchical database also supports using (non-correlated) subqueries within the WHERE clause and wherever a static operand can be used. Subqueries can even be used within another subquery. All subqueries, though, should return a single column (all other columns will be ignored), and each row's single value will be treated as a literal value. If the subquery is used in a clause that expects a single row (e.g., in a comparison), only the subquery's first row will be used.
Subqueries in the hierarchical database are a powerful and easy way to use more complex criteria that is a function of the content in the repository, without having to resort to multiple queries and complex application logic, such as taking the results of one query and dynamically generating the criteria of another query.
Here's an example of a query that finds all nt:file nodes in the repository whose paths are referenced in the value of the vdb:originalFile property of the vdb:virtualDatabase nodes. (This query also uses the $maxVersion variable in the subquery.)
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path]
FROM [nt:file]
WHERE PATH() IN (
   SELECT [vdb:originalFile] FROM [vdb:virtualDatabase]
   WHERE [vdb:version] <= $maxVersion
     AND CONTAINS([vdb:description],'xml OR xml maybe')
)
Without subqueries, this query would need to be broken into two separate queries: the first would find all of the paths referenced by the vdb:virtualDatabase nodes matching the version and description criteria, followed by one (or more) subsequent queries to find the nt:file nodes with the paths expressed as literal values (or variables).

Note

Using a subquery is not only easier to implement and understand, it is actually more efficient.