This is from the team that brought you ‘Framework Design Guidelines’ led by Krzysztof Cwalina, an individual we are not shy to recognize as an amazing mind.
Look for this content (as shared here from their internal work) in the next version of the book.
A Quick Overview
- The Query Pattern, a specification of the set of methods a type must provide in order to be considered a LINQ provider. The names of these methods can be found in System.Linq.Enumerable (found in System.Core). Their signatures follow a pattern that will be discussed later in this chapter.
- A definition of a new interface, IQueryable<T> : IEnumerable<T>
- New types representing generic delegates: Func<> and Action<>
- Lambda expressions, a language feature in VB and C# for defining anonymous delegates, together with the Expression<> family of types, which represents the notion of a delay-compiled delegate.
- An extension to the C# and VB syntax allowing for queries to be expressed in an alternative, SQL-like format. For example (C#):
|
// Extension Method
IEnumerable<string> names = set.Where(x => x.Age > 20).Select(x => x.Name);
// SQL Style
IEnumerable<string> names = from x in set
                            where x.Age > 20
                            select x.Name; |
The interplay between these features is the following: any IEnumerable<> can be queried using the LINQ extension methods, most of which require one or more lambda expressions as parameters; this leads to an in-memory, generic evaluation of the queries. For cases where the set of data is not in memory (e.g. in a database), and/or queries may be optimized, the set of data is presented as an IQueryable<>. When lambda expressions are passed as parameters to the query methods of an IQueryable<>, the compiler transforms them into Expression<> objects. The implementation of IQueryable<> is responsible for processing these expressions. For example, the implementation of an IQueryable<> representing a database table would translate Expression<> objects to SQL queries.
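The difference between the two evaluation paths can be sketched as follows. This is a minimal illustration, not part of the original guidelines: the QueryKinds class and its method names are hypothetical, and AsQueryable() over an in-memory array stands in for a real provider such as a database table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryKinds
{
    // IEnumerable<> path: the lambda is compiled to a Func<int, bool> delegate
    // and Enumerable.Where evaluates it in memory.
    public static List<int> InMemory(IEnumerable<int> source)
    {
        return source.Where(x => x > 1).ToList();
    }

    // IQueryable<> path: the same lambda is captured as an expression tree
    // (Queryable.Where takes an Expression<Func<int, bool>>) and is exposed
    // to the provider through the Expression property of the result.
    public static IQueryable<int> AsExpression(IEnumerable<int> source)
    {
        return source.AsQueryable().Where(x => x > 1);
    }
}
```

Inspecting the Expression property of the value returned by AsExpression shows a method-call node for Where, which is what a provider would translate (for instance to SQL) instead of executing the lambda directly.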
2. Extension Methods, Func<>, Action<>, and Expression<>
Extension methods constitute a language feature that allows static methods to be invoked on instance variables. These methods must take at least one parameter, which represents the instance the method is to operate on. For example, in C#, this is done by using the this modifier on such a parameter, when defining the method:
}
…which in this case would result in false
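A minimal sketch of an extension method and its use is given here for illustration. The names StringExtensions, IsPalindrome and ExtensionDemo are assumptions made for this sketch and are not taken from the original text.

```csharp
using System;

public static class StringExtensions
{
    // The "this" modifier on the first parameter makes IsPalindrome callable
    // as if it were an instance method of string.
    public static bool IsPalindrome(this string s)
    {
        for (int i = 0, j = s.Length - 1; i < j; i++, j--)
        {
            if (s[i] != s[j]) return false;
        }
        return true;
    }
}

public static class ExtensionDemo
{
    public static bool Check(string text)
    {
        // Instance-style call; the compiler rewrites it to
        // StringExtensions.IsPalindrome(text).
        return text.IsPalindrome();
    }
}
```

With this sketch, "hello world".IsPalindrome() evaluates to false, while "level".IsPalindrome() evaluates to true.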
In this example, divide is a function that takes two integers and returns a double. The last parameter in the generic definition of a Func<> is always the return type. To represent void-returning functions, use Action<>.
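The description above could correspond to declarations like the following. This is a sketch under assumptions: the containing class DelegateDemo and the Print action are hypothetical, and the exact body of divide in the original example is not known.

```csharp
using System;

public static class DelegateDemo
{
    // Two int parameters; the last type argument (double) is the return type.
    public static readonly Func<int, int, double> Divide =
        (x, y) => (double)x / (double)y;

    // Action<> is the void-returning counterpart: one string parameter, no result.
    public static readonly Action<string> Print =
        message => Console.WriteLine(message);
}
```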
Notice how the syntax for constructing an Expression<> object is very similar to the one used to construct a Func<> object; in fact, the only difference is the static type declaration of the variable (Expression<> instead of Func<>).
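The similarity can be seen side by side in the sketch below. The class and member names are illustrative assumptions; the lambda is the same in both declarations and only the declared type differs.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    // Compiled by the C# compiler into executable code.
    public static readonly Func<int, int, double> Divide =
        (x, y) => (double)x / (double)y;

    // Same lambda, but kept as a data structure describing the computation.
    public static readonly Expression<Func<int, int, double>> DivideExpression =
        (x, y) => (double)x / (double)y;

    public static double Evaluate(int x, int y)
    {
        // An expression tree must be compiled at run time before it can be invoked.
        Func<int, int, double> compiled = DivideExpression.Compile();
        return compiled(x, y);
    }
}
```

Because DivideExpression is data, code can inspect it (for example its Body and Parameters) before deciding how, or whether, to execute it.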
Expressions may also be constructed programmatically, using the predefined static methods of the non-generic Expression class.
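As a sketch of programmatic construction, the following builds the equivalent of (x, y) => (double)x / (double)y by hand with the factory methods of System.Linq.Expressions.Expression. The ExpressionBuilder class and BuildDivide method are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionBuilder
{
    public static Expression<Func<int, int, double>> BuildDivide()
    {
        // Parameters of the lambda.
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression y = Expression.Parameter(typeof(int), "y");

        // Body: convert both operands to double, then divide.
        BinaryExpression body = Expression.Divide(
            Expression.Convert(x, typeof(double)),
            Expression.Convert(y, typeof(double)));

        // Wrap the body and parameters into a typed lambda expression.
        return Expression.Lambda<Func<int, int, double>>(body, x, y);
    }
}
```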
ý Avoid
þ Consider
- to provide helper functionality relevant to every implementation of an interface, if said functionality can be written in terms of the core interface. This is because concrete implementations cannot otherwise be assigned to interfaces. For example, the LINQ to Objects operators are implemented as extension methods for all IEnumerable<T> types. Thus, any IEnumerable<>
- when object model considerations would dictate taking a dependency on some assembly, but taking such a dependency would break dependency management rules.
For example, consider the domain of telecommunications, where a core domain-specific model is that of network nodes. A library called nodes.dll is defined for this domain. It contains types such as Node, which may have properties such as “address” and “manufacturer”. Now consider the problem of establishing communication paths through such a network (i.e. routing). A second library, routing.dll, is developed for this purpose and takes a dependency on nodes.dll. Good engineering practice dictates that nodes.dll not take a dependency on routing.dll. In the domain of path routing, however, the model for network nodes may gain new characteristics, namely whether a node is a path endpoint and which paths it is associated with. One way to keep the dependencies clean is to define static methods in routing.dll, e.g. Paths.IsNodePathEndpoint(Node n) or Paths.GetListOfPathsCrossingNode(Node n). However, it may be considered more natural to simply have methods like IsEndpoint() defined on Node
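The scenario can be sketched in a single file, with two namespaces standing in for the two assemblies. Everything beyond the names Node and IsEndpoint is an assumption made for illustration: the Telecom and Routing namespaces, the NodeRoutingExtensions class, the MarkAsEndpoint helper, and the address-based bookkeeping are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

namespace Telecom   // stands in for nodes.dll; knows nothing about routing
{
    public class Node
    {
        public string Address { get; private set; }
        public string Manufacturer { get; private set; }

        public Node(string address, string manufacturer)
        {
            Address = address;
            Manufacturer = manufacturer;
        }
    }
}

namespace Routing   // stands in for routing.dll, which references nodes.dll
{
    using Telecom;

    public static class NodeRoutingExtensions
    {
        // Hypothetical bookkeeping: addresses of nodes that terminate a path.
        private static readonly HashSet<string> endpoints = new HashSet<string>();

        public static void MarkAsEndpoint(this Node node)
        {
            endpoints.Add(node.Address);
        }

        // Reads like a member of Node, yet Node carries no routing dependency.
        public static bool IsEndpoint(this Node node)
        {
            return endpoints.Contains(node.Address);
        }
    }
}
```

Code that imports the Routing namespace can then write node.IsEndpoint(), while nodes.dll remains free of any reference to routing.dll.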
ý Avoid defining extension methods on System.Object
This is because, in VB, declaring a variable as object
In this example, the call to Foo
ý Do not
ý Avoid redefining extension methods on a type T
This will fail at compile time, since ExtMethod is considered to be defined twice. This may break valid reusability scenarios; if the scenarios occur late in the development cycle, refactoring namespaces may prove costly. It is thus preferable to avoid this form of overriding methods.
|
Mircea Trofin
Such a situation may occur when using a mix of third-party
libraries. To resolve it, import in a file only one namespace
defining extension methods on a type, and use fully-qualified static
method calls for extension methods defined on the same type in a
different namespace.
Using the example before:
<file 3>
using A;
…
T someObj=…
someObj.ExtMethod(); //this calls A.ExtMethodsInA.ExtMethod
B.ExtMethodsInB.ExtMethod(someObj); //to avoid compilation errors in file 3, we explicitly call the extension method defined in namespace B |
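A self-contained version of this workaround might look like the sketch below. The names A, B, ExtMethodsInA, ExtMethodsInB and ExtMethod come from the example above; the Shared and Consumer namespaces, the File3 class, the string return values and the concrete class named T are assumptions added so that the sketch compiles.

```csharp
namespace Shared
{
    public class T { }
}

namespace A
{
    public static class ExtMethodsInA
    {
        public static string ExtMethod(this Shared.T obj) { return "A"; }
    }
}

namespace B
{
    public static class ExtMethodsInB
    {
        public static string ExtMethod(this Shared.T obj) { return "B"; }
    }
}

namespace Consumer
{
    using A;   // importing B here as well would make someObj.ExtMethod() ambiguous

    public static class File3
    {
        public static string CallA(Shared.T someObj)
        {
            // Binds to A.ExtMethodsInA.ExtMethod, the only one in scope.
            return someObj.ExtMethod();
        }

        public static string CallB(Shared.T someObj)
        {
            // Fully-qualified static call to the extension method in namespace B.
            return B.ExtMethodsInB.ExtMethod(someObj);
        }
    }
}
```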
þ Do
ý Do not
For example, do not define extension methods for Telecom.Node in the Telecom namespace, but in a separate namespace, e.g. Routing
ý Avoid generic naming of namespaces dedicated to extension methods (e.g. “Extensions”) – rather, use a descriptive name – e.g. “Routing
þ Do use the new LINQ types “Func<>” and “Expression<>” instead of custom delegates and predicates, when defining new APIs. The example below illustrates this point:
Benefits of using Func<>, in this case, include minimizing the concept count by employing an existing abstraction, as well as eliminating the dependency on the definition of Tester.
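The contrast might look like the following sketch. Only the delegate name Tester is taken from the text; the CountMatches methods and the two API classes are hypothetical and serve only to show the two parameter styles.

```csharp
using System;
using System.Collections.Generic;

public static class CustomDelegateApi
{
    // Custom delegate: every caller must know about, and depend on, Tester.
    public delegate bool Tester(int value);

    public static int CountMatches(IEnumerable<int> values, Tester test)
    {
        int matches = 0;
        foreach (int value in values)
        {
            if (test(value)) matches++;
        }
        return matches;
    }
}

public static class FuncApi
{
    // Same API expressed with Func<>: no extra type to learn or reference.
    public static int CountMatches(IEnumerable<int> values, Func<int, bool> test)
    {
        int matches = 0;
        foreach (int value in values)
        {
            if (test(value)) matches++;
        }
        return matches;
    }
}
```

Both methods accept the same lambda at the call site, e.g. v => v > 1, but a Func<int, bool> variable can also be passed between unrelated APIs without conversion, which a Tester instance cannot.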
3. Extending on LINQ
There are three means by which a type can be designed to participate in LINQ queries: implementing IEnumerable<> (or interfaces derived from it), implementing IQueryable<>
- Extend IEnumerable<>
- Extend IQueryable<>
- Define the Query Pattern on the new type if the domain makes it impossible or undesirable to implement IEnumerable<> or IQueryable<>
The query methods are those defined by the sponsor class System.Linq.Enumerable, as extension methods to IEnumerable<>
þ Do respect the signature pattern in Figure 1
We use S with or without subscript to indicate a collection type (e.g. IEnumerable<>, ICollection<>) and T, with or without subscript, to indicate the type of elements in that collection. Additionally, we use O<T> to represent subtypes of S<T> that are ordered. For example, S<T> is a notation that could be substituted with IEnumerable<int>. The first parameter of all the methods is the type of the object the method is applied on, and it is marked accordingly by prefixing with “this” – whether one chooses to implement these methods as extension methods or as member methods. Also, anywhere Func<> is being used, one may substitute accordingly with Expression<Func<>>
|
S<T> Where(this S<T>, Func<T,bool>)
S<T2> Select(this S<T1>,Func<T1,T2>)
O<T> OrderBy(this S<T>, Func<T,K>), where K is IComparable
O<T> ThenBy(this O<T>,Func<T,K>), where K is IComparable
S<T> Union(this S<T>, S<T>)
S<T> Skip(this S<T>,int)
S<T> Take(this S<T>,int)
S<T> SkipWhile(this S<T>, Func<T,bool>)
T ElementAt(this S<T>,int)
S<T3> Join(this S<T1>, S<T2>, Func<T1,K1>,Func<T2,K2>,
Func<T1,T2,T3>)
S<T3> SelectMany(this S<T1>,Func<T1,S<T2>>,Func<T1,T2,T3>)
S<T2> SelectMany(this S<T1>,Func<T1,S<T2>>)
|
Figure 1 Query Pattern Method Signatures
Extending IEnumerable<>
þ Do implement IEnumerable<T> to enable basic LINQ support. Such basic support should be sufficient for most in-memory data sets. The basic LINQ support will use the default -provided extension methods on IEnumerable<T>
Although RangeOfInts did not implement a “Where
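A type like RangeOfInts could be sketched as follows. Only the type name comes from the text; the constructor parameters, the RangeDemo class and the EvenValues method are assumptions made for this illustration.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class RangeOfInts : IEnumerable<int>
{
    private readonly int start;
    private readonly int count;

    public RangeOfInts(int start, int count)
    {
        this.start = start;
        this.count = count;
    }

    // The only LINQ-related work the type does is to be enumerable.
    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < count; i++)
        {
            yield return start + i;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class RangeDemo
{
    public static List<int> EvenValues(RangeOfInts range)
    {
        // Where and ToList come from System.Linq.Enumerable, not from RangeOfInts.
        return range.Where(x => x % 2 == 0).ToList();
    }
}
```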
þ Consider redefining the methods in the LINQ pattern on new types implementing IEnumerable<T>
þ Consider implementing ICollection<T>
For example, the Count<> method default behavior (as defined in System.Linq) is to simply walk the IEnumerable
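The effect on Count can be observed with a small sketch. TrackedCollection and its EnumerationsStarted counter are hypothetical names introduced here; the sketch relies on the behavior of Enumerable.Count, which returns the Count property directly when the source implements ICollection<T>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class TrackedCollection : ICollection<int>
{
    private readonly List<int> items = new List<int>();

    // Incremented each time a caller starts walking the collection.
    public int EnumerationsStarted { get; private set; }

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void Add(int item) { items.Add(item); }
    public void Clear() { items.Clear(); }
    public bool Contains(int item) { return items.Contains(item); }
    public void CopyTo(int[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }
    public bool Remove(int item) { return items.Remove(item); }

    public IEnumerator<int> GetEnumerator()
    {
        EnumerationsStarted++;
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CountDemo
{
    public static int LinqCount(TrackedCollection source)
    {
        // Explicit static call to the LINQ operator rather than the property.
        return Enumerable.Count(source);
    }
}
```

After calling CountDemo.LinqCount on a populated TrackedCollection, EnumerationsStarted is still zero, which indicates the operator used the Count property instead of walking the elements.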
Extending IQueryable<>
þ Consider implementing IQueryable<T>
ý Do not implement IQueryable<T>
· Building and parsing expression trees is expensive and many queries can actually get slower when IQueryable<T>
·
þ Do throw NotSupportedException from IQueryable<T>
For example, imagine representing a media stream (like internet radio) as an IQueryable<>. The Count
Implementing the Query Pattern
The Query Pattern refers to defining the methods in Figure 1 without implementing the IEnumerable<> or IQueryable<>
þ Do implement at least the enumerable pattern (i.e. provide a GetEnumerator
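A type that follows the Query Pattern without implementing IEnumerable<> might be sketched like this. Bag<T> and BagDemo are hypothetical names; the sketch relies on the fact that C# query syntax and foreach both bind to methods by name and shape rather than by interface.

```csharp
using System;
using System.Collections.Generic;

// Deliberately implements neither IEnumerable<T> nor IQueryable<T>.
public sealed class Bag<T>
{
    private readonly List<T> items;

    public Bag(IEnumerable<T> source)
    {
        items = new List<T>(source);
    }

    public Bag<T> Where(Func<T, bool> predicate)
    {
        var kept = new List<T>();
        foreach (T item in items)
        {
            if (predicate(item)) kept.Add(item);
        }
        return new Bag<T>(kept);
    }

    public Bag<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        var projected = new List<TResult>();
        foreach (T item in items)
        {
            projected.Add(selector(item));
        }
        return new Bag<TResult>(projected);
    }

    // Enumerable pattern: a public GetEnumerator is enough for foreach.
    public IEnumerator<T> GetEnumerator()
    {
        return items.GetEnumerator();
    }
}

public static class BagDemo
{
    public static List<int> DoubledEvens()
    {
        var bag = new Bag<int>(new[] { 1, 2, 3, 4 });

        // Translated by the compiler to bag.Where(...).Select(...).
        Bag<int> query = from x in bag where x % 2 == 0 select x * 2;

        var result = new List<int>();
        foreach (int value in query)
        {
            result.Add(value);
        }
        return result;
    }
}
```

Note that this sketch evaluates each operator eagerly for brevity; a production design would normally defer the work until the result is enumerated.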
þ Do
þ Do represent ordered sequences as a separate type. Define on this type the “ThenBy
This follows the current pattern in the LINQ to Objects implementation, and also allows for early (compile-time) detection of errors such as applying “ThenBy” to a not “OrderBy
For example, the framework provides the IOrderedEnumerable<> type, which is returned by “OrderBy”. The “ThenBy” extension method is defined for this type, and not for IEnumerable<>
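The typing can be made explicit in a short sketch. OrderingDemo and its method are hypothetical names; IOrderedEnumerable<>, OrderBy and ThenBy are the framework members discussed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    public static List<string> ByLengthThenAlphabetically(IEnumerable<string> words)
    {
        // OrderBy returns the ordered type, not a plain IEnumerable<string>.
        IOrderedEnumerable<string> byLength = words.OrderBy(w => w.Length);

        // ThenBy is only available on the ordered type.
        IOrderedEnumerable<string> ordered =
            byLength.ThenBy(w => w, StringComparer.Ordinal);

        // words.ThenBy(w => w) would not compile: ThenBy is not defined
        // for IEnumerable<string>.
        return ordered.ToList();
    }
}
```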
þ Consider designing the LINQ operators to return specific enumerable types for domains that should be restricted. Essentially, one is free to return anything from a Select query method; however, the expectation is that the query result type should be at least enumerable in a foreach
þ Do
The following methods are exceptions to this rule: All, Any, Average, Contains, Count, ElementAt, Empty, First, FirstOrDefault, Last, LastOrDefault, Max, Min, Single, Sum.
In the example above, the expectation is that the time necessary for evaluating the second line be independent from the size or nature (e.g. in-memory or remote server) of set1. The general expectation is that this line simply prepares set2
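This expectation can be checked with a small sketch. The names set1 and set2 follow the text; DeferredDemo, BuildQuery and the PredicateCalls counter are assumptions added to make the deferral observable.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the Where predicate has actually run.
    public static int PredicateCalls;

    public static IEnumerable<int> BuildQuery(IEnumerable<int> set1)
    {
        // Only prepares the query: the predicate has not run when this returns,
        // regardless of how large or how remote set1 is.
        IEnumerable<int> set2 = set1.Where(x =>
        {
            PredicateCalls++;
            return x > 1;
        });
        return set2;
    }
}
```

PredicateCalls remains unchanged after BuildQuery returns and only increases once the result is enumerated, for example by a foreach loop or a call to ToList().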
ý Avoid implementing just a part of the Query Pattern, if fallback to the basic IEnumerable<T>
For example, consider a user-defined type T, which implements IEnumerable<>. T has an override for Count, but not for Where
In this example, any opportunities for optimization are lost after the Where call. The Count version used is the one defined for IEnumerable<>.
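The loss of the optimization can be demonstrated with a sketch along these lines. FastCountList, FastCountCalls and PartialPatternDemo are hypothetical names standing in for the user-defined type described above.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class FastCountList : IEnumerable<int>
{
    private readonly List<int> items;

    // Counts how often the optimized Count() member was used.
    public int FastCountCalls { get; private set; }

    public FastCountList(IEnumerable<int> source)
    {
        items = new List<int>(source);
    }

    // Optimized member; it takes precedence over Enumerable.Count()
    // when called directly on a FastCountList.
    public int Count()
    {
        FastCountCalls++;
        return items.Count;
    }

    public IEnumerator<int> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class PartialPatternDemo
{
    public static int CountAfterWhere(FastCountList list)
    {
        // Where is Enumerable.Where and returns IEnumerable<int>, so the
        // Count() that follows is Enumerable.Count(), not FastCountList.Count().
        return list.Where(x => x > 1).Count();
    }
}
```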
þ Do place query extension methods in a “Linq” sub-namespace of the main namespace. For example, extension methods for System.Data features reside in System.Data.Linq
þ Do use Expression<Func<>> as a parameter instead of Func<>
As discussed, interacting with an SQL database is already done through IQueryable<T> rather than IEnumerable<T>
An alternative reason for implementing IQueryable<T> is performing optimizations. For example, a sorted list can implement lookup (“Where” clauses) with binary search, which can be much more efficient than the standard IEnumerable<T> or IQueryable<T> implementations.




