This overview shows which query operations perform well and which perform badly on large datasets. It should give you an idea of which operations can be used on large datasets and which should only be applied to small ones.
Native queries are translated to SODA and therefore they share the same basic performance characteristics.
For good query performance, the fields used in a query have to be indexed. Otherwise db4o has to scan through all objects. With an index, these operations should scale logarithmically with the amount of data. The following queries all assume that the fields are indexed.
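Indexes are switched on per field in the configuration, before the container is opened. The following is a minimal sketch rather than the one required setup. It assumes the embedded configuration API of db4o 8 and a hypothetical Item class. The field name indexedString is a guess derived from the getIndexedString() getter used in the examples.

```java
import com.db4o.Db4oEmbedded;
import com.db4o.ObjectContainer;
import com.db4o.config.EmbeddedConfiguration;

final class IndexConfigSketch {
    // Hypothetical stand-in for the Item class used in the examples.
    static class Item {
        String indexedString; // assumed field name behind getIndexedString()

        Item(String indexedString) {
            this.indexedString = indexedString;
        }
    }

    // Opens a database file with a field index on Item.indexedString.
    static ObjectContainer openWithIndex(String fileName) {
        EmbeddedConfiguration configuration = Db4oEmbedded.newConfiguration();
        // The index is addressed by the field name, not by the getter name.
        configuration.common()
                .objectClass(Item.class)
                .objectField("indexedString")
                .indexed(true);
        return Db4oEmbedded.openFile(configuration, fileName);
    }
}
```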
Simple equals operations on indexed fields perform very well.

    final String criteria = Item.dataString(rnd.nextInt(NUMBER_OF_ITEMS));
    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexedString().equals(criteria);
        }
    });
Not equals operations also perform well. However, a 'not equals' operation tends to return a large result, which slows down the query.

    final String criteria = Item.dataString(rnd.nextInt(NUMBER_OF_ITEMS));
    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return !o.getIndexedString().equals(criteria);
        }
    });
Queries which navigate along references are also executed efficiently, as long as every field and reference is indexed.
However, there is a catch: the reference field has to be declared with a concrete type. If the field type is a generic type parameter, an interface or the Object type, the query runs slowly.

    final String criteria = Item.dataString(rnd.nextInt(NUMBER_OF_ITEMS));
    final List<ItemHolder> result = container.query(new Predicate<ItemHolder>() {
        @Override
        public boolean match(ItemHolder o) {
            return o.getIndexedReference().getIndexedString().equals(criteria);
        }
    });
Like regular equals operations, comparisons against references also perform well.

    final Item item = loadItemFromDatabase();
    final List<ItemHolder> result = container.query(new Predicate<ItemHolder>() {
        @Override
        public boolean match(ItemHolder o) {
            return o.getIndexedReference() == item;
        }
    });
Comparison and range queries also perform well.

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexNumber() > criteria;
        }
    });

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexNumber() > biggerThanThis
                    && o.getIndexNumber() < smallerThanThis;
        }
    });
Simple equals operations on dates are fast. However, complex date comparisons are not yet optimized and run extremely slowly. For those you can fall back to SODA.

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexDate().equals(date);
        }
    });
Here is an overview of the query operations with bad performance characteristics. The reason is either that db4o cannot use indexes to perform these queries, or that the native query optimizer cannot translate the query to SODA. In both cases the query time grows linearly with the amount of data.
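The difference between the two groups of queries can be illustrated without db4o. The sketch below is only an analogy: a plain list stands in for the stored objects and a java.util.TreeMap stands in for a field index. It does not show how db4o implements its indexes, and all names in it are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class ScanVersusIndexSketch {
    // Without a usable index every stored value has to be inspected,
    // so the work grows linearly with the number of values.
    static List<Integer> scan(List<String> values, String criteria) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i).equals(criteria)) {
                positions.add(i);
            }
        }
        return positions;
    }

    // Builds a sorted map from each value to the positions where it occurs.
    static TreeMap<String, List<Integer>> buildIndex(List<String> values) {
        TreeMap<String, List<Integer>> index = new TreeMap<>();
        for (int i = 0; i < values.size(); i++) {
            index.computeIfAbsent(values.get(i), key -> new ArrayList<>()).add(i);
        }
        return index;
    }

    // With the sorted map an equals lookup walks a balanced tree,
    // so the work grows logarithmically with the number of distinct values.
    static List<Integer> lookup(TreeMap<String, List<Integer>> index, String criteria) {
        return index.getOrDefault(criteria, new ArrayList<>());
    }
}
```

Both methods return the same positions. Only the amount of work needed to find them differs, which is why the queries in this section become slow once the dataset grows.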
When your query navigates across a getter whose type is a generic parameter, Object or an interface, the performance is bad. This is a limitation of the current query system implementation.

    // The type of 'indexedReference' is the generic parameter 'T'.
    // Due to type erasure that type is unknown to db4o.
    final List<GenericItemHolder<Item>> result = container.query(new Predicate<GenericItemHolder<Item>>() {
        @Override
        public boolean match(GenericItemHolder<Item> o) {
            return o.getIndexedReference().getIndexedString().equals(criteria);
        }
    });
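The effect of type erasure can be checked with plain Java reflection. The sketch below uses hypothetical stand-ins for Item, ItemHolder and GenericItemHolder, with a field name assumed from the getter. It shows that the declared type of a field typed with an unbounded generic parameter is reported as Object at runtime, while a concretely typed field keeps its type.

```java
import java.lang.reflect.Field;

final class ErasureSketch {
    // Hypothetical stand-ins for the classes used in the examples.
    static class Item {}

    static class ItemHolder {
        Item indexedReference; // concrete field type
    }

    static class GenericItemHolder<T> {
        T indexedReference; // erased to Object at runtime
    }

    // Returns the runtime-visible declared type of the 'indexedReference' field.
    static Class<?> declaredFieldType(Class<?> holder) {
        try {
            Field field = holder.getDeclaredField("indexedReference");
            return field.getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For ItemHolder the method returns the Item class, for GenericItemHolder it returns Object, which gives no hint about which class the query navigates into.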
Apart from the simple equals operation, no string operation can use an index at the moment. Therefore string operations such as contains, like and starts-with run slowly. Advanced string operations are not translated to SODA and therefore run even more slowly.

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexedString().contains(criteria);
        }
    });
The native query optimizer doesn't recognize date comparisons, and therefore such queries run extremely slowly. You should fall back to SODA for date queries.

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexDate().after(date);
        }
    });
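The SODA fallback for the date comparison above could look roughly like the following sketch. It assumes the SODA query API of db4o 8 and a hypothetical Item class whose date field is named indexDate. The field name is a guess based on the getIndexDate() getter, and SODA addresses fields by name, so it has to match the real field.

```java
import java.util.Date;
import java.util.List;

import com.db4o.ObjectContainer;
import com.db4o.query.Query;

final class DateQuerySketch {
    // Hypothetical stand-in for the Item class used in the examples.
    static class Item {
        Date indexDate; // assumed field name behind getIndexDate()

        Item(Date indexDate) {
            this.indexDate = indexDate;
        }
    }

    // Returns all items whose date lies after the given date.
    static List<Item> itemsAfter(ObjectContainer container, Date date) {
        Query query = container.query();
        query.constrain(Item.class);
        // Equivalent of o.getIndexDate().after(date), expressed on the field itself.
        query.descend("indexDate").constrain(date).greater();
        return query.execute();
    }
}
```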
Any query which performs a contains operation on a collection or array, or navigates across a collection or array field, will run slowly. The reason is that db4o cannot index collections. Furthermore, the native query optimizer may not recognize such a query and then just loads all objects to process it.

    final List<CollectionHolder> result = container.query(new Predicate<CollectionHolder>() {
        @Override
        public boolean match(CollectionHolder o) {
            return o.getItems().contains(item);
        }
    });
When you do a computation in a query expression, the native query optimizer cannot optimize your query. In that case it loads all objects in order to execute the query.

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexedString().equals("data for " + number);
        }
    });
Therefore you should move any computation outside of the query, like this:

    final String criteria = "data for " + number;
    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.getIndexedString().equals(criteria);
        }
    });
Calling complex methods in native queries is a bad idea. Most of the time the native query optimizer cannot deal with a complex method and will load all objects to execute the query.

    final List<Item> result = container.query(new Predicate<Item>() {
        @Override
        public boolean match(Item o) {
            return o.complexMethod();
        }
    });
The best indication that a query is slow is when it cannot use any field index. Install a diagnostic listener and look for the LoadedFromClassIndex message. That message indicates that a query couldn't use any field index for its execution.
For native queries another indication is when the 'NativeQueryNotOptimized' or the 'NativeQueryOptimizerNotLoaded' diagnostic message occurs. Watch out for those as well.
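A diagnostic listener that reports only these three messages might be registered as in the following sketch. It assumes the diagnostic API of db4o 8. The class and method names of the sketch itself are made up, and printing to the console is just one possible reaction.

```java
import com.db4o.config.EmbeddedConfiguration;
import com.db4o.diagnostic.Diagnostic;
import com.db4o.diagnostic.DiagnosticListener;
import com.db4o.diagnostic.LoadedFromClassIndex;
import com.db4o.diagnostic.NativeQueryNotOptimized;
import com.db4o.diagnostic.NativeQueryOptimizerNotLoaded;

final class SlowQueryDiagnosticsSketch {
    // Registers a listener on the configuration; call this before opening the container.
    static void reportSlowQueries(EmbeddedConfiguration configuration) {
        configuration.common().diagnostic().addListener(new DiagnosticListener() {
            @Override
            public void onDiagnostic(Diagnostic diagnostic) {
                // These three messages hint at queries that could not use a field index
                // or could not be optimized by the native query optimizer.
                boolean slowQueryHint = diagnostic instanceof LoadedFromClassIndex
                        || diagnostic instanceof NativeQueryNotOptimized
                        || diagnostic instanceof NativeQueryOptimizerNotLoaded;
                if (slowQueryHint) {
                    System.out.println(diagnostic);
                }
            }
        });
    }
}
```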