Have Code, Will Travel

December 1, 2009

Dynamic Search Objects Part 6—the Search Service

Filed under: Dynamic SQL — Phil Sandler @ 3:25 pm

If you’ve read this series up to this point, you pretty much have everything you need to implement dynamic queries in your own application.  Just a few notes to wrap up the series and I will get on to other topics.

Creating a Search Service Facade

Depending on your requirements, it may be perfectly reasonable to create a tight coupling between DTO and search implementation.  In cases where the return object of the search is strongly typed (i.e. not a dataset/datatable), this actually may be a necessity.  However, there are a number of reasons why creating a simpler “catch-all” service interface might be desirable.  You may want to actually develop the parameter/DTO implementation and query implementation separately, so your UI developer can write their code before the query has been written.  You may simply want to hide the complexity of the Data Access Layer from other layers of the application.  Or your search service may actually be part of a “real” service layer (SOA). 

In any case, you may want to utilize a simple service facade for search execution that will take in any of your DTO types and determine the correct type of query to perform.  The basic idea is that the consumer using the search service only needs knowledge of the parameter/DTO object and the service interface itself.  Any coupling between the DTO and the query implementation exists only behind the service facade.  So the developer calling the service needs only to set the desired values on the DTO and call SearchService.ExecuteSearch(myDTO).
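To make that concrete, a consumer call might look like this (an illustrative sketch; the property and method names come from the DTO built in earlier parts of the series, and the exact version suffixes may differ):

```csharp
//the caller knows only the DTO and the facade--not OrderSearchServiceV5
var param = new OrderSearchParamV4();
param.SetPaging(25, 1);
param.AddSortField(new SortField(1, SortDirection.Ascending));

SearchQueryResultV2 result = new SearchServiceFacade().ExecuteSearch(param);
```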

A simple, first-pass approach might look something like this:

public class SearchServiceFacade
{
    public SearchQueryResultV2 ExecuteSearch(SearchParamBaseV2 param)
    {
        if (param is OrderSearchParamV4)
        {
            var service = new OrderSearchServiceV5();
            return service.SearchOrders((OrderSearchParamV4) param);
        }
        if (param is CustomerSearchParam)
        {
            var service = new CustomerSearchService();
            return service.SearchCustomers((CustomerSearchParam) param);
        }
        else
        {
            throw new NotSupportedException();
        }
    }
}


This looks a little ugly because I didn’t use an interface for my service (or DTO for that matter).  A more permanent implementation might use a factory to select the query, and could look something like this (consider this pseudo-code!):

public class SearchServiceFacade : IServiceFacade
{
    private readonly ISearchServiceFactory _serviceFactory;

    public SearchServiceFacade(ISearchServiceFactory serviceFactory)
    {
        if (serviceFactory == null) throw new ArgumentNullException("serviceFactory");
        _serviceFactory = serviceFactory;
    }

    public SearchQueryResultV2 ExecuteSearch(ISearchParam param)
    {
        ISearchService service = _serviceFactory.GetService(param);
        return service.PerformSearch(param);
    }
}
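The factory itself isn't shown in the series; a minimal sketch of what it might look like follows (the registration mechanism and names here are my assumptions, not part of the original code):

```csharp
//hypothetical factory: maps the runtime type of the parameter object
//to a delegate that creates the service that handles that query
public class SearchServiceFactory : ISearchServiceFactory
{
    private readonly Dictionary<Type, Func<ISearchService>> _creators =
        new Dictionary<Type, Func<ISearchService>>();

    //wire up one service per parameter type at startup
    public void Register<TParam>(Func<ISearchService> creator) where TParam : ISearchParam
    {
        _creators[typeof(TParam)] = creator;
    }

    public ISearchService GetService(ISearchParam param)
    {
        Func<ISearchService> creator;
        if (!_creators.TryGetValue(param.GetType(), out creator))
            throw new NotSupportedException(param.GetType().FullName);
        return creator();
    }
}
```

Registration would happen once at startup (e.g. `factory.Register<OrderSearchParam>(() => new OrderSearchService());`), or the factory could simply resolve services from an IOC container, which foreshadows the topic of the next post.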


Conclusion

Since I started using this technique in ~2005, I have implemented it on at least five systems with consistent success.  The key to that success?  Basing the design of the framework on the specific needs of each project. 

  • In one of the implementations I worked on, virtually every query against the database needed to be filtered by client Id, so it made sense to create an abstract base query class, with an abstract method FilterByClientId which had to be overridden in each derived class. 
  • In another project the requirement to filter on multiple values in the same field (with different comparison operators) came late in the development cycle, and we had to scramble to update the framework to support this. 
  • In a third project, the query specification needed to be persisted to the database, so the DTO was actually a representation of a set of parameter values that could be used repeatedly to formulate the same query (this was for a reporting system).

Because this technique employs OOP concepts like encapsulation and loose coupling, responding to changing requirements is made much easier, and some measure of flexibility is built right in.

I’m not sure of the EXACT topic of my next post yet, but I have done a lot of work with an Inversion of Control (IOC) framework in the last year, and I have come to believe that the benefits of using these frameworks are huge.  So my next post(s) will share some code and/or thoughts around utilizing an IOC container.


November 20, 2009

Dynamic Search Objects Part 5—Sorting

Filed under: Dynamic SQL — Phil Sandler @ 2:23 pm

In my last post, we added paging back to our original requirements, leaving only sorting to implement.  Since we are covering that topic in today’s post, we can re-add it, which gets us back to all the functionality we had initially envisioned (with some minor clarifications to the original version):

Show all orders, filtering by an optional list of customers [using Company Name] and an optional list of employees who took the order and within an optional date range.  By default, the orders should be sorted by date [Ascending], then customer name [Ascending], but can be sorted by any single or pair [or any number] of columns in ascending or descending order.  The results should show a page of x records at a time, where x can be selected by the client.  The client should also be able to request and receive a specific page of data by entering the desired page number.

Like paging, sorting is built right into the LLBLGen Dynamic List functionality.  So writing the lines of code that actually generate the ORDER BY clause is pretty straightforward.  However, figuring out how to fit a dynamically selected sort order into our framework presents a few design problems that we need to address.

Sorting is a PITA

As noted in Part 1, Dynamic SQL at least allows us to avoid one sorting challenge—using (or generating) a SQL CASE statement and casting the result to a common type.  But whether you are using static SQL (a stored procedure) or Dynamic SQL (as we are in this article), a common problem remains: providing the consumer the ability to effectively express the intended sort order without exposing the inner workings of the query.

Our query (and any query) returns a fixed set of fields in the result set.  The client needs the ability to communicate to our service (via the DTO) the fields that should participate in the sort order, the rank of each field (first sort by fieldX, then by fieldY), and the operator to use for each (ASC or DESC).  So how do we express this in the DTO?  Remember that while the query service does (and must) have knowledge of the DTO, the DTO has no knowledge of the query service, so there is no guarantee that the DTO’s fields match the fields in the service. 

There are a number of ways we can handle this, but none of them are perfect.  Most of the time we end up simply managing sort fields by ordinal number.  So in our query, if a client wants to order the data by company name ascending, then order date descending, the code that sets values on the DTO will need to indicate that we want “field 3 ascending” then “field 1 descending” (3 and 1 are the ordinal numbers that represent the positions of company name and order date, respectively).  This convention needs to be well understood by the developer who is writing the code to add these values to the DTO, as it can get confusing when, for example, a grid in the UI puts fields in a different order than the way they come back in the result set.
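Concretely, using the SortField and SortDirection types defined later in this post, that “field 3 ascending, then field 1 descending” request is expressed as:

```csharp
//ordinals refer to positions in the result set: 3 = CompanyName, 1 = OrderDate
param.AddSortField(new SortField(3, SortDirection.Ascending));
param.AddSortField(new SortField(1, SortDirection.Descending));
```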

Wait, Can’t We Just Sort on the Client Side?

Deferring sorting to the client side (e.g. jQuery, Dataset/Dataview sorting) may look tempting.  And it actually is a viable option for paging, sorting and filtering, if (and only if):

  1. the set of data is small, and
  2. there is little concern about performance and scalability

From my experience, the systems that meet this kind of criteria are few and far between.  Note that it’s not possible to mix and match sorting/paging between the application and the database—if you ask the database for an unsorted 100 rows of data, then sort those rows on the application side, you’re likely not going to get the set of data you were expecting.  This may seem obvious, but I’ve seen the mistake made enough times that I wanted to at least mention it.

It is much more efficient to sort and filter in the database for a number of reasons.  For one thing, database engines are highly optimized for doing exactly the kind of work that sorting and filtering entail; this is what their underlying code was designed to do.  But even barring that—even assuming you could write code that could match the kind of sorting, filtering and paging performance of a mature database engine—it’s still preferable to do this work in the database, for the simple reason that it’s more efficient to limit the amount of data that is transferred from the database to the application server. 

So for example, if you have 10,000 rows before filtering, and your query pares that number down to 75, filtering on the client results in the data from all 10,000 rows being passed over the wire (and into your app server’s memory), where filtering on the database side would result in only the filtered 75 rows being moved between database and application.  This can make a huge impact on performance and scalability.

Sorting is a PITA 2: Revenge of the RDBMS

One last note about using sorting and paging in conjunction.  A query that implements paging can have odd results if the ORDER BY clause does not include a field that represents an empirical sequence in the data; sort order is not guaranteed beyond what is explicitly specified in the ORDER BY clause in most (maybe all) database engines.   An example: if you have 100 orders that all occurred on the exact same date, and you ask for the first page of this data sorted by this date, then ask for the second page of data sorted the same way, it is entirely possible that you will get some of the data duplicated across both pages.  So depending on the query and the distribution of data that is “sortable,” it can be a good practice to always include a unique field (like a primary key) as the final field in a sort clause if you are implementing paging.

Have I mentioned that sorting is a PITA?

Our Approach

Now that I’ve explained my views on Life, the Universe, and Sorting in 5000 words or less, let’s get to the ~20 lines of code that need to be added to implement sorting in our framework.

Let’s create an enum to represent sort direction (ascending or descending).  LLBLGen Pro actually provides a similar type out of the box, but for the purpose of this article, I don’t want knowledge of the LLBLGen framework exposed above the service layer (to be clear, this is not a restriction that I normally enforce):

public enum SortDirection
{
    Ascending,
    Descending
}

Next we need a class to represent the combination of (ordinal) field and the sort direction:

public class SortField
{
    private readonly int _fieldPosition;
    private readonly SortDirection _direction;

    public SortField(int fieldPosition, SortDirection op)
    {
        _fieldPosition = fieldPosition;
        _direction = op;
    }

    public SortDirection Operator
    {
        get { return _direction; }
    }

    public int FieldPosition
    {
        get { return _fieldPosition; }
    }
}

Since we will provide the ability to sort by multiple fields, we need a list of our SortField class on our DTO.  Since sorting will likely be supported on every implementation, we’ll add it to the base class.  I’m only showing code that changed in the base class here—grab the source code if you want to see it in context:

private readonly List<SortField> _sortFields = new List<SortField>();

public void AddSortField(SortField sort)
{
    _sortFields.Add(sort);
}

public List<SortField> SortFields
{
    get { return _sortFields; }
}

Our Dynamic List wrapper needs to change in a few ways.  We need to create a private method to transform our list of sort fields into a sort expression:

private SortExpression CreateSortExpression(List<SortField> sortFields)
{
    if (sortFields == null || sortFields.Count == 0) return null;

    var existingFields = new HashSet<int>();

    var sort = new SortExpression();
    foreach(SortField field in sortFields)
    {
        //this prevents the same field from being added twice
        if (existingFields.Contains(field.FieldPosition)) continue;
        
        existingFields.Add(field.FieldPosition);

        SortOperator op = field.Operator == SortDirection.Ascending ? SortOperator.Ascending : SortOperator.Descending;
        var sortClause = new SortClause(_fields[field.FieldPosition], null, op);
        sort.Add(sortClause);
    }

    return sort;
}

Now our main execute method needs to change to accept the list of sort fields, hand it to our new method, and use the resulting sort expression in our database call:

public SearchQueryResultV2 ExecuteQuery(int pageNumber, int pageSize, List<SortField> sortFields)
{
    SortExpression sort = CreateSortExpression(sortFields);
    int dbCount;
    var dt = new DataTable();
    using (var adapter = new DataAccessAdapter())
    {
        //fetch the data table
        adapter.FetchTypedList(_fields, dt, _bucket, 0, sort, false, _groupBy, pageNumber, pageSize);
        //perform an additional query to get the total count of rows that matched
        dbCount = adapter.GetDbCount(_fields, _bucket, _groupBy, false);
    }

    return new SearchQueryResultV2(dt, dbCount, pageNumber, pageSize);
}

Finally, we need to implement the default sort order from our requirements if no order was specified, and call the new version of the ExecuteQuery method.  Note that we also add the PK to the sort order to ensure that our sequence is empirically maintained between pages of data (see above):

//handle default sort
if (param.SortFields.Count == 0)
{
    param.AddSortField(new SortField(1, SortDirection.Ascending));
    param.AddSortField(new SortField(3, SortDirection.Ascending));
}

//add PK to the end of the sort clause.
param.AddSortField(new SortField(0, SortDirection.Ascending));

return listManager.ExecuteQuery(param.PageNumber, param.PageSize, param.SortFields);

That should do it for sorting, and that should be all you need to get up and running with a single implementation of a Dynamic Search Object.  In my next post (which I promise will be brief!), I’ll talk a little bit about creating a clean service interface for situations when you have many implementations.

Source Code for Part 5


November 17, 2009

Dynamic Search Objects Part 4—Paging Through Records

Filed under: Dynamic SQL — Phil Sandler @ 4:54 pm

In my last post, we revised our requirements to read:

Show all orders, filtering by an optional list of customers and an optional list of employees who took the order and within an optional date range.

While this took us a step closer to our original requirements (see Part 1), we are still missing a few features–paging and sorting.  We’ll cover paging in this installment, and save sorting for next time.  So for now our requirements can be amended to add:

The results should show a page of x records at a time, where x can be selected by the client.  The client should also be able to request and receive a specific page of data by entering the desired page number.

Our framework is based on LLBLGen Pro’s powerful dynamic list functionality, which already supports paging, so adding this flexibility is very straightforward.  We already have a class that encapsulates our search criteria (the parameter or DTO class), so all we really need to do is modify it with a few additional fields, then use those fields at the appropriate time in our query service.

Violating YAGNI

I’m going to violate YAGNI slightly by creating a base class for our DTO which will store and expose paging criteria.  Why?  Because I know for a fact that I’m going to be writing more than one of these dynamic searches, and I’d rather not re-create the same functionality in each implementation.

The only change to our DTO/parameter class (not shown) is that it now inherits from a base class:

public abstract class SearchParamBaseV1
    {
        private int _pageNumber = 0;
        private int _pageSize = 0;
        
        public void SetPaging(int pageSize, int pageNumber)
        {
            _pageNumber = pageNumber;
            _pageSize = pageSize;
        }

        public int PageNumber
        {
            get { return _pageNumber; }
        }

        public int PageSize
        {
            get { return _pageSize; }
        }
    }
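From the consumer’s side, requesting a particular page is then a single call on the DTO (the values and the exact DTO version name here are illustrative):

```csharp
var param = new OrderSearchParamV1();
param.SetPaging(25, 2);  //page size 25, request the second page
```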

Our DynamicListWrapper class only needed two changes: the ExecuteQuery signature and the first database call.  The changed lines:

public SearchQueryResultV2 ExecuteQuery(int pageNumber, int pageSize)

adapter.FetchTypedList(_fields, dt, _bucket, 0, null, false, _groupBy, pageNumber, pageSize);

Our search Service class didn’t need to change much–the call to the execute method now passes the paging information from the DTO object:

return listManager.ExecuteQuery(param.PageNumber, param.PageSize);

I also fixed a bug in the service class that I came across when I implemented paging (glad I decided to make writing tests part of my process!).  I found an explanation of why specifying the type is necessary here.  The code in question before:

listManager.AddField(new EntityField2("EmployeeName",
                                      EmployeesFields.FirstName + " " +
                                      EmployeesFields.LastName), false);

. . . and after:

listManager.AddField(new EntityField2("EmployeeName",
                                      EmployeesFields.FirstName + " " +
                                      EmployeesFields.LastName, typeof(string)), false);

I’m also going to modify the SearchQueryResult class to include the paging information.  This is useful in scenarios where the consumer of the result (e.g. the view in MVC) needs this information but has no knowledge of the DTO or its values.

public class SearchQueryResultV2
    {
        public int TotalRows { get; private set; }
        public DataTable Data { get; private set; }
        public int PageNumber { get; private set; }
        public int PageSize { get; private set; }

        public SearchQueryResultV2(DataTable data, int totalRows, int pageNumber, int pageSize)
        {
            if (data == null) throw new ArgumentNullException("data");
            Data = data;
            TotalRows = totalRows;
            PageSize = pageSize;
            PageNumber = pageNumber;
        }
    }
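One piece of information a view typically derives from this class is the total number of pages.  Since the class carries both TotalRows and PageSize, that’s just ceiling division (TotalPages is not a member of the class—this is a consumer-side sketch, assuming a non-zero PageSize):

```csharp
//ceiling division: 75 rows at 25 per page -> 3 pages; 76 rows -> 4 pages
int totalPages = (result.TotalRows + result.PageSize - 1) / result.PageSize;
```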

In my next post, we’ll address the dreaded topic of sorting our results (cue foreboding music).

Source Code for Part 4


November 16, 2009

Dynamic Search Objects Part 3—Filtering and the Predicate Helper

Filed under: Dynamic SQL — Phil Sandler @ 4:54 pm

In my last post, we created a Dynamic List Wrapper to help facilitate generating dynamic SQL.  Our wrapper exposed the PredicateExpression property of LLBLGen’s RelationPredicateBucket class, which is essentially the mechanism by which a developer can add to the WHERE clause of the final query.  In this installment, we will create a helper class that will ease transforming our filter criteria into an LLBL Predicate Expression.

How Far Should We Take This?

I have implemented this pattern in a number of systems to date, and without question the biggest difference between the implementations is the level of flexibility/granularity in the filtering system.  There are a lot of possible options that can really amp up the power and utility of your framework:

  • Support for all comparison operators: =, !=, >, <, <=, >=
  • Support for all string comparisons: Equal, StartsWith, EndsWith, Contains
  • Support for aggregate filters (HAVING clause)
  • Support for combining filters with AND or OR (“OrderDate > x AND OrderDate <= y” )
  • Support for subquery filters (“WHERE CustomerId IN (SELECT CustomerId FROM . . .)”)

At the same time, implementing all these features increases complexity and adds time to unit and system testing.  For example, if you implement the ability to filter on multiple values on a string field AND choose the string comparison used for each value, your parameter object will need to express both the value and the operator to use, otherwise the query implementation will have no way to interpret it (you might use a dictionary or more complex structure to accomplish this).  This will result in more code paths, and many more cases to test against known data to ensure that your code is working properly.
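As an illustration of that last point, a richer parameter structure might pair each value with its operator (a sketch—these type names are mine, not part of the series):

```csharp
//hypothetical: each filter value carries its own comparison operator,
//so the query implementation knows how to interpret it
public enum StringComparisonType { Equal, StartsWith, EndsWith, Contains }

public class StringCriterion
{
    private readonly string _value;
    private readonly StringComparisonType _comparison;

    public StringCriterion(string value, StringComparisonType comparison)
    {
        _value = value;
        _comparison = comparison;
    }

    public string Value { get { return _value; } }
    public StringComparisonType Comparison { get { return _comparison; } }
}
```

The DTO would then expose a List&lt;StringCriterion&gt; instead of a List&lt;string&gt;, and the predicate helper would branch on Comparison when building each predicate.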

Once again, the overall requirements of your system should dictate how much of this flexibility really needs to be built.

Keeping it Simple (at Least for Now)

We don’t need to get that granular for this article, so we’re going to define a few conventions:

  • Multiple filter values specified within a list will use “OR“
  • All other filters will use “AND”
  • String comparisons will effectively be “Contains” comparisons (“LIKE ‘%’ + [value] + ‘%’”)
  • Date comparisons supported will be “greater than or equal” and “less than or equal”
  • All other comparisons will use the Equals operator.

Before we get to the code, let’s add back some of the requirements that we pared down in Part 1.  We limited our requirements to:

Show all orders, optionally filtered by any combination of: a customer, occurring after start date, occurring before an end date.

Let’s modify that to:

Show all orders, filtering by an optional list of customers and an optional list of employees who took the order and within an optional date range.

Previous code examples have shown the customer filter to use the CustomerId field.  I’m going to change that slightly here to demonstrate the string comparison functionality, which really wouldn’t make a lot of sense with an ID field (which would likely use an exact match, as criteria would be picked from a list).  I’m going to switch this to filter against the CompanyName field, which a user might more realistically want to search using only part of the name.

The Code

Here is the code for the predicate helper.  Since our requirements are pretty simple, there is no reason to break this into smaller classes at this point:

internal class PredicateHelper
    {
        internal PredicateExpression CreateStringFilter(IList<string> values, IEntityField2 fieldToFilter)
        {
            var pred = new PredicateExpression();
            foreach(string value in values)
            {
                pred.AddWithOr(new FieldLikePredicate(fieldToFilter, null, "%" + value + "%"));
            }
            return pred;
        }

        internal IPredicate CreateIntFilter(List<int> values, EntityField2 fieldToFilter)
        {
            var pred = new PredicateExpression();
            foreach (int value in values)
            {
                pred.AddWithOr(new FieldCompareValuePredicate(fieldToFilter, null,
                    ComparisonOperator.Equal, value));
            }
            return pred;
        }

        internal PredicateExpression CreateDateGreaterEqualFilter(DateTime? value, IEntityField2 fieldToFilter)
        {
            var pred = new PredicateExpression();
            if (value.HasValue)
            {
                pred.Add(new FieldCompareValuePredicate(fieldToFilter, null,
                    ComparisonOperator.GreaterEqual, value));
            }
            return pred;
        }

        internal PredicateExpression CreateDateLessEqualFilter(DateTime? value, IEntityField2 fieldToFilter)
        {
            var pred = new PredicateExpression();
            if (value.HasValue)
            {
                pred.Add(new FieldCompareValuePredicate(fieldToFilter, null,
                    ComparisonOperator.LessEqual, value));
            }
            return pred;
        }
    }

The relevant snippet of the search service V3 (the full version is included in the source code link):

var predicateHelper = new PredicateHelper();
listManager.Filter.Add(predicateHelper.CreateStringFilter(param.CompanyName, CustomersFields.CompanyName));
listManager.Filter.Add(predicateHelper.CreateDateGreaterEqualFilter(param.StartDate, OrdersFields.OrderDate));
listManager.Filter.Add(predicateHelper.CreateDateLessEqualFilter(param.EndDate, OrdersFields.OrderDate));
listManager.Filter.Add(predicateHelper.CreateIntFilter(param.EmployeeId, OrdersFields.EmployeeId));

In my next post, we’ll add paging back to our requirements, and modify our framework to support it.

Source Code for Part 3


November 12, 2009

Dynamic Search Objects Part 2—Dynamic List Wrapper

Filed under: Dynamic SQL — Phil Sandler @ 9:52 pm

In my last post, I demonstrated a concrete implementation of a dynamic query, not very far removed from the basic functionality that ships with LLBLGen Pro.  In this installment, we’ll create a convenience wrapper that will generally make creating dynamic lists easier, and further facilitate rapidly moving from a set of requirements for a search to a working implementation.

The wrapper is essentially a stateful class that combines the structures that make up a Dynamic List, and provides a convenient way to put those structures together and then execute the query.

This class will be responsible for:

  • Allowing fields to be created
  • Managing the GroupBy collection via field creation
  • Allowing relations to be added
  • Allowing filters to be added
  • Allowing sort expressions to be added (future post)
  • Adding paging (future post)
  • Managing the state of all of the above structures before query execution
  • Executing the query and returning a SearchQueryResult

The exact details of the wrapper should be tailored to the overall requirements of your system.  For example, if none of your search results require any kind of aggregation, the GroupBy clause functionality isn’t necessary.

Note that defining fields, adding relations, creating aggregates, etc. are still the responsibility of the query implementation.  The core of the pattern is to have one query implementation per parameter implementation; there is no getting around having to build classes that have specific knowledge about the fields and entities involved in the query.  Our goal is to make each implementation as simple as possible.

Version 1 of our Dynamic List Wrapper:

public class DynamicListWrapperV1
    {
        private readonly RelationPredicateBucket _bucket = new RelationPredicateBucket();
        private ResultsetFields _fields;
        private GroupByCollection _groupBy;

        public void AddRelation(IRelation relation)
        {
            _bucket.Relations.Add(relation);
        }

        public IPredicateExpression Filter
        {
            get { return _bucket.PredicateExpression; }   
        }

        public void AddField(IEntityField2 fieldToAdd, bool addToGroupBy)
        {
            AddField(fieldToAdd, addToGroupBy, fieldToAdd.Name);
        }

        public void AddField(IEntityField2 fieldToAdd, bool addToGroupBy, string fieldName)
        {
            //if this is the first field, instantiate
            if (_fields == null) _fields = new ResultsetFields(1);
            else _fields.Expand(1);
            
            _fields.DefineField(fieldToAdd, _fields.Count - 1, fieldName);
            
            if (addToGroupBy)
            {
                if (_groupBy == null) _groupBy = new GroupByCollection();
                _groupBy.Add(fieldToAdd);
            }
        }

        public SearchQueryResult ExecuteQuery()
        {
            int dbCount;
            var dt = new DataTable();
            using (var adapter = new DataAccessAdapter())
            {
                //fetch the data table
                adapter.FetchTypedList(_fields, dt, _bucket, 0, null, false, _groupBy);
                //perform an additional query to get the total count of rows that matched
                dbCount = adapter.GetDbCount(_fields, _bucket, _groupBy, false);
            }

            return new SearchQueryResult(dt, dbCount);
        }
    }

Our updated Order Search Service:

public class OrderSearchServiceV2
    {
        public SearchQueryResult SearchOrders(OrderSearchParamV1 param)
        {
            var listManager = new DynamicListWrapperV1();

            //create the fields.  This is roughly the equivalent of the SELECT statement
            listManager.AddField(OrdersFields.OrderId, true);
            listManager.AddField(OrdersFields.OrderDate, true);
            listManager.AddField(OrdersFields.ShippedDate, true);
            listManager.AddField(CustomersFields.CompanyName, true);
            listManager.AddField(EmployeesFields.FirstName, true);
            listManager.AddField(EmployeesFields.LastName, true);

            listManager.AddField(new EntityField2("EmployeeName",
                                                  EmployeesFields.FirstName + " " +
                                                  EmployeesFields.LastName), false);

            var countField = OrderDetailsFields.ProductId;
            countField.AggregateFunctionToApply = AggregateFunction.Count;
            listManager.AddField(countField, false, "OrderDetailCount");
            
            //setup relations.  This is roughly the equivalent of the FROM and JOIN statements
            listManager.AddRelation(OrdersEntity.Relations.CustomersEntityUsingCustomerId);
            listManager.AddRelation(OrdersEntity.Relations.EmployeesEntityUsingEmployeeId);
            listManager.AddRelation(OrdersEntity.Relations.OrderDetailsEntityUsingOrderId);

            //filter by customerId if it was provided
            if (param.CustomerId != null)
            {
                listManager.Filter.Add(OrdersFields.CustomerId == param.CustomerId);
            }

            //filter by StartDate if it was provided
            if (param.StartDate.HasValue)
            {
                listManager.Filter.Add(OrdersFields.OrderDate >= param.StartDate);
            }

            //filter by EndDate if it was provided
            if (param.EndDate.HasValue)
            {
                listManager.Filter.Add(OrdersFields.OrderDate <= param.EndDate);
            }

            return listManager.ExecuteQuery();
        }
    }

This is a pretty good start, and is an improvement over what we put together in Part 1.  However, we’re still dealing with a pared-down set of requirements which really don’t necessitate a dynamic query.  In my next post, we’ll add back some of these requirements, modify our parameter object to support more flexible filtering, and create a helper class to simplify creating those filters.

Source Code for Part 2


November 10, 2009

Dynamic Search Objects Part 1–Introduction

Filed under: Dynamic SQL,Uncategorized — Phil Sandler @ 8:51 pm

A problem that is common to almost any database application is the need to search the database for specific information, based on a set of criteria provided either by a user/UI or another process.  When the requirements are simple, for example:

Show all orders for a specific customer that occurred during a specific period of time.

the problem is easily solvable using a static query/stored procedure:

SELECT * FROM orders WHERE CustomerID = @customer_id AND OrderDate BETWEEN @start_date AND @end_date

However, the problem becomes much more complex when the requirements begin to include things like:
  • Multiple criteria for a specific field
  • Optional criteria
  • Optional sorting, especially with reversible field sorting (ascending and descending) and the ability to sort by multiple fields
  • Paging (especially in conjunction with sorting, since paging REQUIRES server-side sorting)

Show all orders, filtering by an optional list of customers and an optional list of employees who took the order and within an optional date range.  By default, the orders should be sorted by date, then customer name, but can be sorted by any single or pair of columns in ascending or descending order.  The results should show a page of x records at a time, where x can be selected by the client.  The client should also be able to request and receive a specific page of data by entering the desired page number.
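The paging requirement in particular reduces to a bit of arithmetic worth pinning down before touching the query layer.  A minimal sketch (the class and method names here are mine, not part of the series):

```csharp
using System;

public static class PagingMath
{
    // Convert a 1-based page number and page size into the zero-based
    // row offset a server-side paged query skips past.
    public static int Offset(int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        return (pageNumber - 1) * pageSize;
    }

    // Total pages for a given row count, rounding up for a partial page.
    public static int TotalPages(int totalRows, int pageSize)
    {
        return (totalRows + pageSize - 1) / pageSize;
    }
}
```

With x = 25 records per page, requesting page 3 skips the first 50 rows, and 101 matching rows yield 5 pages.  This is also why paging requires server-side sorting: the offset is only meaningful against a stable row order.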

There ARE ways of accomplishing all of this with static SQL, but the end result is a massively complex query that is not easily maintainable.  The sorting alone is a mess, requiring a SQL CASE statement that needs the result of each case to be cast to the same type (see Dynamic ORDER BY Statements in Stored Procedures).  Using multiple criteria per field requires passing an array of values as a single parameter, which then needs to be split back into separate values and integrated into the query (see Arrays and Lists in SQL Server 2005 for what this looks like in SQL Server).  Optional criteria require testing each parameter against NULL as part of the WHERE clause, e.g. "WHERE CustomerID = @customer_id OR @customer_id IS NULL."

Thankfully, with the growing acceptance of Object Relational Mapping (ORM) and the Dynamic Query Engines (DQE) that they provide, creating powerful, flexible queries that support highly complex criteria has become much easier.  In this series of articles, I will demonstrate harnessing the power of a DQE to meet the above requirements (and beyond?) in a type-safe and highly maintainable way.

Implementation Details

I will be using LLBLGen Pro (v2.6) as the ORM/DQE. More specifically, I will be using the Dynamic List feature of LLBL, which results in vanilla DataTables/DataSets as a query result.  Although DataSets (vanilla or typed) are not a good substitute for real business objects, I find them extremely useful for representing lists of data, particularly when the data includes columns from multiple database entities.  A flat list of data is frequently convenient for iteration when data binding (in Web/WinForms) or generating HTML (in MVC).

While this article will focus on creating a set of tools to be used in conjunction with LLBLGen and DataSets, the overall intent is to demonstrate the concept of harnessing any DQE.  This same pattern is equally applicable to using full-featured business/domain objects, and should be possible using any mature ORM framework, including NHibernate, Entity Framework, or LINQ to SQL.  I am choosing LLBLGen as a DQE because it is the one I am most familiar with, and I find its predicate syntax to be very intuitive and expressive.

I will be using SQL Server 2005 as the database engine, and the Northwind database as the target schema.  You won’t need a full version of LLBLGen to tinker with the sample code, as I will provide the LLBL-generated code, which is freely distributable.

Solution Overview

The basic concept is really quite simple.  Create a class (essentially a DTO) that represents all the possible criteria the client requires.  Then create a corresponding class in the Data Access Layer (DAL) that can translate the values from the DTO into a dynamic query.

The client only needs to have knowledge of the DTO, the service, and the type that is returned by the service (Business Object, DataSet, List, etc.).  In the case of my example, some knowledge of what will be provided in the DataSet (column names, for example) will be required as well, since DataSet columns are not typed.
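For example, pulling values out of the returned DataTable means indexing columns by name, so the client and the query implementation have to agree on those names out of band (the column name below is illustrative, not prescribed by the series):

```csharp
using System;
using System.Data;

public static class SearchResultReader
{
    // Read the "OrderDate" column from every row of a search result.
    // The string must match whatever name the query assigned the column;
    // a typo only fails at runtime, which is the cost of untyped DataSets.
    public static DateTime[] OrderDates(DataTable data)
    {
        var result = new DateTime[data.Rows.Count];
        for (int i = 0; i < data.Rows.Count; i++)
        {
            result[i] = (DateTime)data.Rows[i]["OrderDate"];
        }
        return result;
    }
}
```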

That’s really all there is to it.  As simple as it sounds, I have found it to be a very effective way to quickly create powerful searches that perform well and are easy to maintain.

A Super-Basic Example

Before getting too deep into creating abstractions, wrappers, and helper classes for our framework, let’s start with a concrete example of how this is going to work.  Let’s temporarily pare the requirements down to:

Show all orders, optionally filtered by any combination of: a specific customer, a start date (orders occurring on or after it), and an end date (orders occurring on or before it).

This is intended as an illustration only—if the requirements were really this simple, a more straightforward approach, like a single method that takes in three parameters, would probably be a more appropriate solution.
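For contrast, here is roughly what that more straightforward approach looks like, sketched against an in-memory list so the optional-criteria semantics are visible.  A null parameter means “don’t filter on this field,” mirroring the three if-blocks in the query implementation below; everything in this snippet is my illustration, not code from the series:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public string CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class SimpleOrderSearch
{
    // One method, three parameters; each null parameter is simply skipped.
    public static List<Order> Search(IEnumerable<Order> orders,
        string customerId, DateTime? startDate, DateTime? endDate)
    {
        return orders
            .Where(o => customerId == null || o.CustomerId == customerId)
            .Where(o => !startDate.HasValue || o.OrderDate >= startDate.Value)
            .Where(o => !endDate.HasValue || o.OrderDate <= endDate.Value)
            .ToList();
    }
}
```

This works fine for three criteria, but every new optional filter adds a parameter to the signature, which is exactly the pressure that pushes toward a DTO.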

DTO class for representing all the possible criteria:

public class OrderSearchParamV1
{
    public string CustomerId { get; set; }
    public DateTime? StartDate { get; set; }
    public DateTime? EndDate { get; set; }
}

Result class (the return value of the service):

public class SearchQueryResult
{
    public int TotalRows { get; private set; }
    public DataTable Data { get; private set; }

    public SearchQueryResult(DataTable data, int totalRows)
    {
        if (data == null) throw new ArgumentNullException("data");
        Data = data;
        TotalRows = totalRows;
    }
}
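Note that TotalRows is kept separate from Data.Rows.Count, presumably so that once paging is added, Data can hold just the current page while TotalRows reports every row that matched.  A quick exercise of the class (the Demo type is my own, repeating SearchQueryResult so the snippet stands alone):

```csharp
using System;
using System.Data;

public class SearchQueryResult
{
    public int TotalRows { get; private set; }
    public DataTable Data { get; private set; }

    public SearchQueryResult(DataTable data, int totalRows)
    {
        if (data == null) throw new ArgumentNullException("data");
        Data = data;
        TotalRows = totalRows;
    }
}

public static class Demo
{
    // Simulate a "page" of 2 rows out of 3 total matches.
    public static SearchQueryResult OnePageOfThree()
    {
        var dt = new DataTable();
        dt.Columns.Add("OrderId", typeof(int));
        dt.Rows.Add(1);
        dt.Rows.Add(2);
        return new SearchQueryResult(dt, 3);
    }
}
```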

Now the LLBL code.  Comments on what each section is doing are inline:

public class OrderSearchServiceV1
    {
        public SearchQueryResult SearchOrders(OrderSearchParamV1 param)
        {
            //create the fields.  This is roughly the equivalent of the SELECT statement
            var fields = new ResultsetFields(6);
            fields[0] = OrdersFields.OrderId;
            fields[1] = OrdersFields.OrderDate;
            fields[2] = OrdersFields.ShippedDate;
            fields[3] = CustomersFields.CompanyName;
            fields[4] = new EntityField2("EmployeeName", EmployeesFields.FirstName +
                " " + EmployeesFields.LastName);
            fields.DefineField(OrderDetailsFields.ProductId, 5, "OrderDetailCount", AggregateFunction.Count);

            //Setup a GROUP BY clause for non-aggregate fields
            var groupBy = new GroupByCollection();
            groupBy.Add(fields[0]);
            groupBy.Add(fields[1]);
            groupBy.Add(fields[2]);
            groupBy.Add(fields[3]);
            groupBy.Add(fields[4]);
            groupBy.Add(EmployeesFields.FirstName);
            groupBy.Add(EmployeesFields.LastName);
            
            //setup relations.  This is roughly the equivalent of the FROM and JOIN statements
            var bucket = new RelationPredicateBucket();
            bucket.Relations.Add(OrdersEntity.Relations.CustomersEntityUsingCustomerId);
            bucket.Relations.Add(OrdersEntity.Relations.EmployeesEntityUsingEmployeeId);
            bucket.Relations.Add(OrdersEntity.Relations.OrderDetailsEntityUsingOrderId);

            //filter by customerId if it was provided
            if (param.CustomerId != null)
            {
                bucket.PredicateExpression.Add(OrdersFields.CustomerId == param.CustomerId);
            }

            //filter by StartDate if it was provided
            if (param.StartDate.HasValue)
            {
                bucket.PredicateExpression.Add(OrdersFields.OrderDate >= param.StartDate);
            }

            //filter by EndDate if it was provided
            if (param.EndDate.HasValue)
            {
                bucket.PredicateExpression.Add(OrdersFields.OrderDate <= param.EndDate);
            }

            int dbCount;
            var dt = new DataTable();
            using (var adapter = new DataAccessAdapter())
            {
                //fetch the data table
                adapter.FetchTypedList(fields, dt, bucket, 0, null, false, groupBy);
                //perform an additional query to get the total count of rows that matched
                dbCount = adapter.GetDbCount(fields, bucket, groupBy, false);
            }

            return new SearchQueryResult(dt, dbCount);
        }
    }

I know what you’re thinking: that looks like a lot of code to do something relatively simple.  But remember that:

  1. Our “real” requirements are going to add much more complexity, and
  2. we’re going to create a more refined toolset to make much of this code reusable.

In my next post, I will create the first draft of our “Dynamic List Wrapper”, which will make creating fields, relations, and aggregations easier, and will centralize the code that actually makes the calls to the database.

Source Code for Part 1

