Thursday, October 18, 2007

LINQ to SQL Has No "Out-of-the-Box Multi-Tier Story!"

The LINQ to SQL Team bites the bullet: Dinesh Kulkarni admits in his October 15, 2007 LINQ to SQL: What is NOT in RTM (V1) post that LINQ to SQL v1 will be missing (at least) the following features:

  1. Support for other databases with or without a public provider model.
  2. Out-of-the-box multi-tier story.
  3. Handling schema changes.
  4. Support for specific mappings (value types and m:n relationships with no relation table.)
  5. Identity between designer, SqlMetal and run-time (external mapping file and SSC support in designer)
  6. LINQ to SQL in compact framework.

[I've shortened and edited the feature descriptions for emphasis and further emphasized point 2.]

Dinesh says the missing features are ranked by "frequency/strength of the feedback and consistency with what LINQ to SQL was designed for." In my view, LINQ to SQL was never intended to support databases other than SQL Server (including Compact Edition); its expression trees are connected at the hip to Transact-SQL. This makes lack of an "Out-of-the-box multi-tier story" my number one missing feature of LINQ to SQL's v1 release.

The Missing Multi-Tier and Remoting Story

The problem I see is that LINQ to SQL has no multi-tier or remoting story. Searching for 'tier' in the LINQ Project General Forum today returned 172 hits. Most questions concerned the following three- or four-tier scenario:

  1. Request, in a single message from the UI tier, a set of related business objects for editing from a Windows Communication Foundation (WCF) service, via a client proxy that uses HTTP[S] or TCP/IP as the transport.
  2. Supply the business objects in a single message from a business services layer (BSL) that in turn connects to a LINQ to SQL persistence layer (PL, a.k.a. Data Access Layer or DAL). The BSL provides authentication and authorization services; WCF handles digital signing, encryption, and reliable messaging operations.
  3. Add, edit, and/or remove one or more root and/or related business objects in the UI, which is most commonly an ASP.NET Web page.
  4. Send all modified root and related business objects to the BSL in as few messages as possible; validate the changes in the BSL; then update the local business objects and persist them in the PL, which reports concurrency conflicts to the BSL (and the UI, if necessary) for resolution.

In some questions, the BSL and DAL/PL are combined in a single layer.

If you replace the term business objects with DataTables from a DataSet, the preceding scenario is in common use today with .NET 2.0 Web services. .NET 3.0's WCF simplifies creating and managing services that use HTTP and TCP/IP transports for Internet services and intranet remoting. As noted in my LINQ to SQL and Entity Framework XML Serialization Issues with WCF - Part 1 post of October 9, 2007, the DataContractSerializer serializes EntitySets by default and EntityRefs with a relatively simple patch.
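
To make the scenario concrete, here's a sketch of the kind of WCF service contract steps 1, 2, and 4 imply. LINQ to SQL supplies nothing like this out of the box; the interface name is my own invention, and the method signatures simply mirror the proxy calls discussed later in this post:

using System.Collections.Generic;
using System.ServiceModel;

// Hypothetical contract for the three- or four-tier scenario above.
// Customer is the O/R Designer-generated entity with unidirectional
// serialization enabled so its Orders EntitySet round-trips.
[ServiceContract]
public interface INorthwindService
{
    // Steps 1 and 2: return the Customer graph (with Orders and
    // Order_Details EntitySets) in a single message.
    [OperationContract]
    List<Customer> GetUSCustomers();

    // Step 4: send current and original values back in one chunky message
    // so the BSL/PL can detect value-based concurrency conflicts.
    [OperationContract]
    void InsertCustomers(List<Customer> added);

    [OperationContract]
    void UpdateCustomers(List<Customer> current, List<Customer> original);

    [OperationContract]
    void DeleteCustomers(List<Customer> current, List<Customer> original);
}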

The LINQ Project General Forum's Update existing disconnected objects (DLinq) thread that started February 1, 2007 is typical of requests for information on how to implement step 4 of the preceding scenario. Here's Matt Warren's February 3, 2007 response to Zoinky's question of February 2:

The usage pattern you are describing is not supported by the feature set currently available.  The only scenario with 'disconnected' objects supported so far requires you to re-attach your objects and play-back the changes against them. This is intended for a multi-tier scenario where you are responsible for serializing the object's data to the client tier or browser and back again along with change information.  More DLINQ features in this area are still to come.

The scenario you are describing sounds like it would be better served as a two-tier model using normal attached objects databound to UI.

There are four problems with the preceding answer:

  1. Re-attaching is "intended for a multi-tier scenario", but Zoinky's typical multi-tier scenario isn't supported. In effect, then, there's no support for multi-tier scenarios at all.
  2. Playing back changes implies sending an individual message for each change, which results in a CRUDy (chatty) service, an anti-pattern.
  3. More LINQ to SQL "features in this area" didn't come, as noted in the next topic.
  4. If your feature doesn't support three or more tiers, suggesting that a two-tier design would be better isn't an answer. It's obvious that Zoinky and others requesting a solution to three-tier issues would use two tiers if it suited their requirements.

After complaints about lack of a three-tier solution, Matt concluded the thread on April 12, 2007 with the following:

The reconnecting API's you have available to you at this time do not support the scenario you guys are discussing. There is more work coming that makes 3-tier scenario's easier. The re-attach and playback API's are included because they are the bare minimum required to make any reconnecting work at all, however cumbersome it may seem at this time. There is an upcoming feature that automatically databinds LINQ to SQL objects directly to webforms that use these API's to communicate changes back into the DataContext. 

There is also a technology in the works that automatically serializes LINQ to SQL objects to other tiers, change tracks and data binds the objects for you on that tier and serializes the objects back, solving the cyclic reference serialization problem and the indescribable schema problem for objects with change sets; which seems to match what many of you are describing.

Here's my take on this answer:  

  1. The "upcoming feature that automatically databinds LINQ to SQL objects directly to webforms that use these API's to communicate changes back into the DataContext" is ASP.NET 3.5's LinqDataSource component, which must connect to a local DataContext and dictates client/server (two-tier) architecture.
  2. I haven't seen a "playback API." The above answer is the only LINQ-related original document returned by a Google search on '"playback API" LINQ -MIDI'.
  3. We're still waiting for the "technology in the works that automatically serializes LINQ to SQL objects to other tiers," which turned out to be the phantom "mini connectionless DataContext" described in my Changes Coming to LINQ to SQL post of May 15, 2007.

And here's Matt's announcement of the mini connectionless DataContext's demise in LINQ to SQL v1 from the Forum's LINQ: Next CTP thread:

No, this feature will not be in the first release of LINQ to SQL. (June 19, 2007)

It is only missing because of lack of time.  When I mentioned it, I was talking about our forward thinking and a sample I'm trying to put together. (June 27, 2007)

Needless to say, no sample has been forthcoming as of mid-October.

Considering the Microsoft Developer Division's emphasis on tiered data-access architectures, initially with Web services and now with WCF, "no out-of-the-box multi-tier support" by what promised to be the most widely used LINQ implementation boggles the imagination.

Emulating the Phantom Mini Connectionless DataContext

The "Mike Taulty Implements an Unbound DataContext in a WCF Client" topic of my LINQ and Entity Framework Posts for 10/5/2007+ post points to an article that describes how to write your own connectionless DataContext class and related code to support disconnected addition, modification, and deletion of a root object without child objects. Mike's ClientSideContext class wraps a standard DataContext that has an empty connection string and uses the DataContext.GetTable() method to deliver Table<TEntity> objects without sending T-SQL queries to a non-existent persistence store. The client and server apps share a common NorthwindDataContext class (called DataTypes) generated by SqlMetal.exe.

Note: The client's DataContext class copy includes persistence-related [Table ...] and [Column ...] attributes, as well as connection string information that could be removed to provide persistence ignorance.
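
For readers who haven't followed the link, here's a rough sketch of the wrapper idea, assuming the RTM API names. Only the DataContext calls are real API; the wrapper's member names are illustrative assumptions, not Mike's exact code:

using System.Collections.Generic;
using System.Data.Linq;
using System.Linq;

// Illustrative sketch only: a client-side wrapper that reuses the
// DataContext's change tracker without ever opening a connection.
public class ClientSideContext
{
    // The shared, generated context with an empty connection string; it
    // never issues T-SQL because the client never calls SubmitChanges().
    private readonly NorthwindDataContext context = new NorthwindDataContext("");

    // Start tracking a disconnected entity received from the service.
    public void Attach<TEntity>(TEntity entity) where TEntity : class
    {
        context.GetTable<TEntity>().Attach(entity);
    }

    // New objects added to a tracked EntitySet (for example, by the grids'
    // BindingSources) appear in the change set as pending inserts.
    public List<TEntity> GetInserted<TEntity>() where TEntity : class
    {
        return context.GetChangeSet().Inserts.OfType<TEntity>().ToList();
    }
}

The GetModified<TEntity>() and GetDeleted<TEntity>() members used later in this post, which pair current and original values for the service calls, are omitted from the sketch for brevity.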

Supporting Associated Objects

I modified Mike's approach by using the O/R Designer to create the Northwind.dbml and related designer files for Customer, Order, and Order_Detail entities, created a Data Source from the Customer entity, and dragged-and-dropped DataGridView controls with BindingSource components bound to the Customer entity and the Customer.Orders and Order.Order_Details EntitySets. (The DataContext class doesn't serialize EntityRefs because WCF's DataContractSerializer doesn't support cyclic references unless you add a source code patch.)

Note: I use the parent/child/grandchild grids and a new Customer/Order/Order_Details item for almost all Windows form test harnesses. Failing to reproduce the archetypal DataSet demo with hierarchical data sources and associated EntitySets and EntityRefs has left several LINQ to SQL bugs undetected by the LINQ to SQL team's QA folks. Here's a screen capture of the ContextClient.exe test harness that resembles most others I've built:

The time shown in the preceding capture is that required for the client to retrieve the SOAP response message (GetUSCustomers.xml, 177 KB) for the 14 US Customer objects (including BOGUS), which includes their associated Order and Order_Detail object instances.
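
Here, for reference, is a simplified sketch of what such a GetUSCustomers service method could look like. It isn't the actual ContextService code, just a minimal illustration; the DataLoadOptions.LoadWith calls are the key to populating the Orders and Order_Details EntitySets before the DataContractSerializer runs:

using System.Collections.Generic;
using System.Data.Linq;
using System.Linq;

public partial class NorthwindService // implements the hypothetical contract sketched earlier
{
    public List<Customer> GetUSCustomers()
    {
        using (NorthwindDataContext dc = new NorthwindDataContext())
        {
            // Eager-load the child EntitySets so the full graph is in memory
            // before the DataContext is disposed and serialization begins.
            DataLoadOptions options = new DataLoadOptions();
            options.LoadWith<Customer>(c => c.Orders);
            options.LoadWith<Order>(o => o.Order_Details);
            dc.LoadOptions = options;

            return dc.Customers.Where(c => c.Country == "USA").ToList();
        }
    }
}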

The following capture shows the ContextService.exe harness with part of the T-SQL batch for deleting the BOGUS Customer and its dependent Order and Order_Detail objects:

The full T-SQL batch for BOGUS order deletion with provision for referential integrity and value-based concurrency management is:

Method: DeleteCustomers

DELETE FROM [dbo].[Order Details] WHERE ([OrderID] = @p0) AND ([ProductID] = @p1) AND ([UnitPrice] = @p2) AND ([Quantity] = @p3) AND ([Discount] = @p4)
-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [12278]
-- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]
-- @p2: Input Currency (Size = 0; Prec = 19; Scale = 4) [10.0000]
-- @p3: Input Int16 (Size = 0; Prec = 0; Scale = 0) [12]
-- @p4: Input Single (Size = 0; Prec = 0; Scale = 0) [0.05]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.20706.1

DELETE FROM [dbo].[Order Details] WHERE ([OrderID] = @p0) AND ([ProductID] = @p1) AND ([UnitPrice] = @p2) AND ([Quantity] = @p3) AND ([Discount] = @p4)
-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [12278]
-- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [2]
-- @p2: Input Currency (Size = 0; Prec = 19; Scale = 4) [20.0000]
-- @p3: Input Int16 (Size = 0; Prec = 0; Scale = 0) [24]
-- @p4: Input Single (Size = 0; Prec = 0; Scale = 0) [0.1]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.20706.1

DELETE FROM [dbo].[Orders] WHERE ([OrderID] = @p0) AND ([CustomerID] = @p1) AND ([EmployeeID] = @p2) AND ([OrderDate] = @p3) AND ([RequiredDate] = @p4) AND ([ShippedDate] IS NULL) AND ([ShipVia] = @p5) AND ([Freight] = @p6) AND ([ShipName] = @p7) AND ([ShipAddress] = @p8) AND ([ShipCity] = @p9) AND ([ShipRegion] = @p10) AND ([ShipPostalCode] = @p11) AND ([ShipCountry] = @p12)
-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [12278]
-- @p1: Input String (Size = 5; Prec = 0; Scale = 0) [BOGUS]
-- @p2: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]
-- @p3: Input DateTime (Size = 0; Prec = 0; Scale = 0) [10/18/2007 12:00:00 AM]
-- @p4: Input DateTime (Size = 0; Prec = 0; Scale = 0) [11/1/2007 12:00:00 AM]
-- @p5: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]
-- @p6: Input Currency (Size = 0; Prec = 19; Scale = 4) [10.0000]
-- @p7: Input String (Size = 20; Prec = 0; Scale = 0) [Bogus Software, Inc.]
-- @p8: Input String (Size = 13; Prec = 0; Scale = 0) [1220 Broadway]
-- @p9: Input String (Size = 7; Prec = 0; Scale = 0) [Oakland]
-- @p10: Input String (Size = 2; Prec = 0; Scale = 0) [CA]
-- @p11: Input String (Size = 5; Prec = 0; Scale = 0) [94612]
-- @p12: Input String (Size = 3; Prec = 0; Scale = 0) [USA]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.20706.1

DELETE FROM [dbo].[Customers] WHERE ([CustomerID] = @p0) AND ([CompanyName] = @p1) AND ([ContactName] = @p2) AND ([ContactTitle] = @p3) AND ([Address] = @p4) AND ([City] = @p5) AND ([Region] = @p6) AND ([PostalCode] = @p7) AND ([Country] = @p8) AND ([Phone] = @p9) AND ([Fax] = @p10)
-- @p0: Input StringFixedLength (Size = 5; Prec = 0; Scale = 0) [BOGUS]
-- @p1: Input String (Size = 20; Prec = 0; Scale = 0) [Bogus Software, Inc.]
-- @p2: Input String (Size = 9; Prec = 0; Scale = 0) [Joe Bogus]
-- @p3: Input String (Size = 10; Prec = 0; Scale = 0) [Proprietor]
-- @p4: Input String (Size = 13; Prec = 0; Scale = 0) [1220 Broadway]
-- @p5: Input String (Size = 7; Prec = 0; Scale = 0) [Oakland]
-- @p6: Input String (Size = 2; Prec = 0; Scale = 0) [CA]
-- @p7: Input String (Size = 5; Prec = 0; Scale = 0) [94612]
-- @p8: Input String (Size = 3; Prec = 0; Scale = 0) [USA]
-- @p9: Input String (Size = 14; Prec = 0; Scale = 0) [(510) 555-1212]
-- @p10: Input String (Size = 14; Prec = 0; Scale = 0) [(510) 555-1213]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.20706.1

Elapsed time: 0.550 seconds.

The corresponding SOAP message from the WCF Service Trace Viewer is 3,242 bytes; click the link to view the DeleteBogus.xml trace in IE 6/7. Notice that the <before ...> node includes a serialized <a:Orders> node and the <after ...> node doesn't. You must Insert() an object before you can Remove() it. In this case, the original and current values are swapped so that the <before ...> node carries the original values for the Order and Order_Detail EntitySets, and it wasn't necessary to execute the code that adds them.
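
On the service side, a DeleteCustomers method that produces a batch like the preceding one might attach the disconnected originals and mark the whole graph for deletion, letting SubmitChanges order the DELETEs child-to-parent and generate the value-based concurrency predicates. The following is only a sketch under those assumptions (the table names follow the generated NorthwindDataContext), not the actual ContextService code:

using System.Collections.Generic;
using System.Data.Linq;

public partial class NorthwindService
{
    // Consumes the lists sent by proxy.DeleteCustomers(deleted.Current,
    // deleted.Originals); only the originals matter for the DELETE batch.
    public void DeleteCustomers(List<Customer> current, List<Customer> original)
    {
        using (NorthwindDataContext dc = new NorthwindDataContext())
        {
            foreach (Customer cust in original)
            {
                // Attach the disconnected objects so their values feed the
                // value-based concurrency WHERE clauses shown above.
                dc.Customers.Attach(cust);
                foreach (Order ord in cust.Orders)
                {
                    dc.Orders.Attach(ord);
                    dc.Order_Details.AttachAll(ord.Order_Details);
                    dc.Order_Details.DeleteAllOnSubmit(ord.Order_Details);
                }
                // LINQ to SQL doesn't cascade deletes, so each level must be
                // marked for deletion explicitly.
                dc.Orders.DeleteAllOnSubmit(cust.Orders);
                dc.Customers.DeleteOnSubmit(cust);
            }
            dc.SubmitChanges(ConflictMode.ContinueOnConflict);
        }
    }
}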

Adding Missing Orders and Order_Details EntitySets to Original Values 

I'm still hoping for an explanation of why the default DataContext's GetOriginalEntityState(modifiedEntity) method doesn't support EntitySets. Here's a copy of Matt Warren's comment to the "LINQ to SQL's GetOriginalEntityState() Method Has Serious Bug" topic of my LINQ and Entity Framework Posts for 10/15+ post:

Roger, its not possible for GetOriginalEntityState to return copies of entities with EntityRef's and EntitySet's set. These associations are normally bi-directional, so only one entity instance can ever refer to another without duplicating the entire graph. The purpose of GetOriginalEntityState is to give back a copy of the entity with only the data fields set.

The upshot of this message is that the GetOriginalEntityState() method returns what amounts to a shallow clone of the desired entity when what's needed is the equivalent of a deep clone. (For more about this topic, see An insight into cloning objects in .NET.)

My current workaround for this problem is to create List<Customer> lists on the client from two individual SOAP requests and not attach the second list to the ClientSideContext. (I'm investigating more efficient techniques, such as creating a deep clone of the original EntitySets on loading data.)
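
One way to implement the deep-clone alternative is to round-trip the entity graph through the DataContractSerializer, which copies the Orders and Order_Details EntitySets along with the data fields. Here's a minimal sketch; it assumes the entities carry DataContract/DataMember attributes (the O/R Designer's unidirectional serialization mode), and the helper name is mine:

using System.IO;
using System.Runtime.Serialization;

public static class EntityCloner
{
    // Deep-clone an entity graph by serializing and deserializing it.
    // Only [DataMember] members survive the round trip, so EntityRefs
    // excluded by unidirectional serialization are not copied.
    public static T DeepClone<T>(T source)
    {
        DataContractSerializer serializer = new DataContractSerializer(typeof(T));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (T)serializer.ReadObject(stream);
        }
    }
}

With such a helper, the original Customer graph could be cloned immediately after the first SOAP response arrives, eliminating the second request.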

Here's the client-side code that attaches the first (customers) list to the ClientSideContext to make EntitySets accessible for editing or deletion:

ctx = new ClientSideContext();
foreach (Customer c in customers)
{
    ctx.Attach(c);
    // Attach each Order entity from the Customer's Orders EntitySet,
    // indexing into the EntitySet rather than enumerating it
    Order ord = null;
    for (int i = 0; i < c.Orders.Count; i++)
    {
        ord = c.Orders[i];
        ctx.Attach(ord);
        // Attach each Order_Detail entity from the Order's Order_Details EntitySet
        for (int j = 0; j < ord.Order_Details.Count; j++)
        {
            ctx.Attach(ord.Order_Details[j]);
        }

    }
}

Note: The preceding is a modification of Rick Strahl's approach in his Complex Detached Entities in LINQ to SQL - more Nightmares post of October 1, 2007.

The SOAP request message to update the BOGUS Customer with a minor edit is 5,043 bytes in the WCF Service Trace Viewer; click the link to view the UpdateBogus.xml trace, which has only a change to the <ContactName> element value, in IE 6/7. Notice that both the <after ...> and <before ...>  nodes include serialized <a:Orders> nodes. (Code adds the <a:Orders> nodes to the <before ...> node immediately prior to sending the message.) The current and original values are required for comparison when returning the entire Customer object to the service for updates.

Here's the code to add the original EntitySet (Orders) values from the origCusts List<Customer> that result in the message's <a:Orders> subnodes of the <before ...> (original values) node:

// Retrieve the modified Customer objects and restore their original EntitySets
var modified = ctx.GetModified<Customer>();
if (modified.Current.Count > 0)
{
    foreach (Customer c in modified.Originals)
    {
        if (c.Orders.Count == 0)
        {
            foreach (Customer o in origCusts)
            {
                if (o.CustomerID == c.CustomerID)
                // Add original Orders EntitySet members to modified.Originals
                {
                    // Adding an Order to c.Orders removes it from o.Orders
                    int ordCount = o.Orders.Count;
                    for (int i = 0; i < ordCount; i++)
                    {
                        c.Orders.Add(o.Orders[0]);
                    }
                    break;
                }
            }
        }
    }
    proxy.UpdateCustomers(modified.Current, modified.Originals);
}

Comparing the elements of the <after ...> and <before ...> nodes' <a:Orders> subnodes of UpdateBogusOrders.xml proves that the preceding code works. The <after ...> node reflects DataGridView changes to the <ContactName>, <EmployeeID>, <ShippedDate>, and <ProductID> element values. It's impossible to prove that the service updates the order correctly because the blocking 295402 Incorrect Update Behavior - Attaching an Entity after Calling a Relationship Property bug reported on 8/27/2007 causes EntityRef updates to insert rather than update orders. (The inserted Order and Order Detail records reflect the DataGridView edits.)

As is the case with two-tier (client/server) implementations, only root-object updates in DataGridViews work as expected; edits to EntitySet members don't mark the root object dirty (reported as bug ID 305828, Change to LINQ to SQL EntitySet Member Doesn't Mark Root Object Dirty). You also must write code to handle adding and removing EntitySet members, as well as a hack to add the root object to the lists for updates.
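
For completeness, here's a sketch of a service-side UpdateCustomers method that consumes the (current, original) pairs the client sends. It relies on the Table<TEntity>.Attach(entity, original) overload so SubmitChanges can generate value-based concurrency checks; the body is an illustration only, not the actual ContextService code, and the update bugs noted above still apply to the child Orders and Order_Details objects:

using System.Collections.Generic;
using System.Data.Linq;

public partial class NorthwindService
{
    public void UpdateCustomers(List<Customer> current, List<Customer> original)
    {
        using (NorthwindDataContext dc = new NorthwindDataContext())
        {
            for (int i = 0; i < current.Count; i++)
            {
                // Attaching the current and original copies together marks the
                // Customer as modified and supplies the original values for the
                // UPDATE statement's optimistic-concurrency WHERE clause.
                dc.Customers.Attach(current[i], original[i]);

                // Child Orders and Order_Details need their own Attach calls;
                // this is where bug 295402 (insert instead of update) bites.
            }
            dc.SubmitChanges(ConflictMode.ContinueOnConflict);
        }
    }
}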

Updated 10/19/2007 1300 PST: Clarified, added code and provided an additional XML document link to the preceding section.

Future Work Required

For this initial post, I've implemented Mike's proxy.InsertCustomers(ctx.GetInserted<Customer>()), proxy.DeleteCustomers(deleted.Current, deleted.Originals) and proxy.UpdateCustomers(modified.Current, modified.Originals) methods for one or more Customer objects with two levels of sub-objects. This demonstrates that there is a multi-tier story for LINQ to SQL but it's unlikely to win a Pulitzer prize. In fact, it's a downright discouraging story.

The next steps are to investigate tradeoffs between CRUDy (chatty) proxy.InsertOrders(), proxy.DeleteOrders(), proxy.UpdateOrders(), proxy.InsertOrder_Details(), proxy.DeleteOrder_Details(), and proxy.UpdateOrder_Details() methods versus using chunky proxy.UpdateCustomers() to handle changes to child objects. Tests with ASP.NET and the LinqDataSource also are in the works. (The LinqDataSource requires a DataContext as its data source.)

Here's Hoping that the Entity Framework Team Doesn't Fall into this Trap

Mike Taulty discusses strangeness in Entity Framework deletions in his October 3, 2007 LINQ to Entities - Deleting and August 27, 2007 Entity Framework - Object Services Level. Deleting and ObjectStateManager posts. LINQ to SQL has similar oddities with removing object instances from the persistence store when deleting rows in bound DataGridViews, as described in my 10/23/2007 The LINQ to SQL DataBinding Story post. (Coming soon.)

He also compares the problem of managing optimistic concurrency with the Entity Framework and LINQ to SQL in his September 19, 2007 Disconnected LINQ to Entities post. Hopefully, Mike will soon try a similar hack to create a "maxi connectionless ObjectContext."

Updated 10/23/2007 0930 PST: Added and updated content to this section.

1 comment:

Captain Ramen said...

A lot of emphasis on this topic has been put in the C UD of CRUD ("...points to an article that describes how to write your own connectionless DataContext class and related code to support disconnected addition, modification, and deletion of a root object without child objects."), but not the R.

IMO, the data access layer should contain just the CRUD methods, and should be separate from the persistence layer.

The best way to achieve this is to serialize the linq query before sending it up the wire... unless of course this functionality already exists or is easy to implement (I am not talking about the dynamic linq sample here, because there is no compile time checking, and I might as well send the SQL directly) and I have forgotten to rtfm.