Advanced techniques

Even more complex queries

The following query (notice the question mark after p.Manager)
select p, p.Name, p.Budget, m.Name
from Project p, p.Manager? m
where m.Name like ?
order by p.Budget desc 


denotes an outer join : it will return all projects, event those without a project manager. Whereas the inner join version (that without the question mark) that we previously came across with will return only those projects which have a project manager.

Revisiting the average age by project example

Suppose you want to display the average age for each project.
IObjectContext ctx5 = CreateContext();

foreach (Project p in ctx5.Collect<Project>()
{
    float totalAge = 0;
    foreach (Employee e in p.Members)
	totalAge += DateTime.Now - e.BirthDate;
    float average = totalAge / p.Members.Count;
    // TODO : Display the average for project p
}


Look at the generated SQL:
  • there is one SQL query for the initial ctx5.Collect statement
  • one query (or two queries, depending on how the mapping is done) for each p.Members statement (not for each Employee in p.Members, the framework is smart enough to optimize this)

The number of SQL statements is directly a function of the number of projects in the database. Ideally, only two queries should suffice:
  • one SQL query for the initial ctx5.Collect statement
  • one query for all the Employee objects that will be used inside the loop

This is precisely what the Preload is about : minimizing the number of queries by loading ahead of time the relevant data. Our example can thus be rewritten this way :
IObjectContext ctx5 = CreateContext();
ctx5.Preload("Project p, p.Members? e");

foreach (Project p in ctx5.LocalStorage.Collect<Project>()
{
     // content of the loop unchanged ; but without n+1 select behaviour
}


The Preload statement instructs the ObjectContext to load its LocalStorage with all the Project entities and with all the Employee entities that take part in a Project. Recall that the question mark indicates that the join is an outer join (so that even Projects without Members are preloaded).

The Collect method is now executed directly on the LocalStorage since we know that all the Projects we are interested in have been preloaded.
The p.Members statement no longer generates an SQL query. The LocalStorage has marked this relation as being already loaded.
The total gain is that the number of SQL queries is now exactly two, regardless of the number of Project entities.

More preloading options

Suppose that a processing starts by preloading the Project entities, then later wants to preload the Employee entities that are Members of a Project as before. This scenario is illustrated by the following code snippet :
IObjectContext ctx5 = CreateContext();
ctx5.Preload("Project p);

// some initial processing
ctx5.Preload("Project p, p.Members? e");

foreach (Project p in ctx5.LocalStorage.Collect<Project>())
{
    // content of the loop unchanged
}

In this case, the second Preload statement would generate a useless query to preload the Project entities whereas we know for sure that all the entities of interset are already in the LocalStorage. In this case, use the following syntax
ctx5.Preload("(Project p), p.Members e");


to tell not to preload the Project entities (thoug the p variable is useful to qualify the Members reference)
Incidentally, the outer join is no longer useful in this context since we are interested in Members only, not in the association Project – Members.

Event more preloading options

The syntax of Preload statements also supports “where clauses” so that only a subset of all entites are preloaded, such as in
ctx5.Preload("(Project p), (p.Manager m), p.Members e where m.Name like ?", "J%");


This statement preloads all those members that take part in a project with a known project manager whose name starts with “J”.

Querying the RemoteSource, the LocalStorage or the ObjectContext ?

The RemoteSource, the LocalStorage and the ObjectContext posess a set of similar methods with identical name but with a slightly varying semantics. This table sums up what each method does in which context and when to use each.

// TODO (by order of increasing functionality, but decreasing performance!)
// On RemoteSource, only the database is queried, regardless of the data in the LocalStorage
// On LocalStorage, only the local cache is queried, regardless of the data in the database
// On ObjectContext, a Preload is first executed to update the data in the LocalStorage, then the query is performed on
// the (freshly fed) LocalStorage.

Last edited Sep 6, 2007 at 8:14 AM by ysaule, version 3

Comments

No comments yet.