Sunday, June 12, 2011

Windows Azure and Cloud Computing Posts for 6/10/2011+

A compendium of Windows Azure, Windows Azure Platform Appliance, SQL Azure Database, AppFabric and other cloud-computing articles.


• Updated 6/12/2011 with articles marked from Steve Marx, Jeremy Ashkenas, Julie Lerman, Maureen O’Gara, Arthur Cole, Pierre Menard, Pablo Castro, Sidharth Ghag, Microsoft Talent Network, Haddicus, Thomas Conté, Anthony Savvas and Valery Mizonov.

• Updated 6/11/2011 with articles marked from Klint Finley, Michael Washington, Michael Washam, Clemens Vasters, Murali Krishnaprasad, Azret Botash, AppFabric Team Blog, Eve Maler and Ralf Handl.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:

To use the above links, first click the post’s title to display the single article you want to navigate.


Azure Blob, Drive, Table and Queue Services

•• Valery Mizonov described Implementing Storage Abstraction Layer to Support Very Large Messages in Windows Azure Queues in a 6/7/2011 post to the AppFabricCAT blog that was released on 6/12/2011:

The following blog post is intended to offer a developer-oriented perspective on the implementation of a generics-aware storage abstraction layer for the Windows Azure Queue Service. The problem space addressed by the post is centered on supporting very large messages in Windows Azure queues and overcoming the 8KB limitation that exists today. Simply put, this blog and the associated code will enable you to utilize Windows Azure Queues without having to engage in the message size bookkeeping imposed by the queue's 8KB limit.

Why Large Messages?

There were wonderful times when "640K ought to be enough for anybody". A few kilobytes could buy a luxurious storage space where a disciplined developer from the past was able to happily put all of her application's data. Today, the amount of data that modern applications need to be able to exchange can vary substantially. Whether it's a tiny HL7 message or a multi-megabyte EDI document, modern applications have to deal with all sorts of volumetric characteristics evolving with unpredictable velocity. A business object that was expressed in a multi-byte structure in the last century may easily present itself today as a storage-hungry artifact several times larger than its predecessor, thanks to modern serialization and representation techniques and formats.

Handling messages in a given solution architecture without imposing technical limitations on message size is the key to supporting ever-evolving data volumes. Large messages cannot always be avoided. For instance, if a B2B solution is designed to handle EDI traffic, the solution needs to be prepared to receive EDI documents of up to several megabytes in size. Every tier, service and component in the end-to-end flow needs to accommodate the document size that is being processed. Successfully accepting a 20MB EDI 846 Inventory Advice document via a Web Service but failing to store it in a queue for processing due to the queue's message size constraints would be an unpleasant discovery during testing.

Why would someone choose to use a queue for large messages on the Windows Azure platform? What's wrong with the other alternatives such as blobs, tables, cloud drives or SQL Azure databases, to name a few? Mind you, queues allow implementing certain types of messaging scenarios characterized by asynchronous, loosely coupled communication between producers and consumers, performed in a scalable and reliable fashion. The use of Windows Azure queues decouples different parts of a given solution and offers unique semantics such as FIFO (First In, First Out) and At-Least-Once delivery. Such semantics can be somewhat difficult to implement using the alternative data exchange mechanisms. Furthermore, queues are best suited as a volatile store for exchanging data between services, tiers and applications, not as persistent data storage. The respective data exchange requirement can manifest itself in many different forms, such as passing messages between components in an asynchronous manner, load leveling, or scaling out complex compute workloads. Many of these data exchange patterns are not straightforward to implement without queues. In summary, queues are a crucial capability. Not having to worry about what can and cannot go into a queue is a strong argument for building unified queue-based messaging solutions that can handle data of any size.

In this blog post, I'm going to implement a solution that will enable the use of a Windows Azure queue for exchanging large messages. I also intend to simplify the way my solution interacts with Windows Azure queues by providing an abstraction layer built on top of the Queue Service API. This abstraction layer will make it easier to publish and consume instances of application-specific entities, as opposed to having to deal with byte arrays or strings, which are the only types supported by the Queue Service API today. I am going to make extensive use of .NET generics, take advantage of some value-add capabilities such as transparent stream compression and decompression, and apply some known best practices such as handling intermittent faults in order to improve the fault-tolerance of storage operations.

Design Considerations

As things stand today, a message that is larger than 8KB (after it's serialized and encoded) cannot be stored in a Windows Azure queue. The client-side API will return an exception if you attempt to place a message larger than 8KB on a queue. The maximum allowed message size can be determined by inspecting the MaxMessageSize property of the CloudQueueMessage class. As of the writing of this post, the message size limit returned by this property is 8192 bytes.

Important:

The maximum message size defined in the CloudQueueMessage.MaxMessageSize property does not reflect the maximum allowed payload size. Messages are subject to Base64 encoding when they are put on a queue, and the encoded payloads are always larger than their raw data: Base64 emits 4 characters for every 3 bytes of input, roughly a 33% increase. As a result, the 8KB size limit effectively prevents storing any message whose raw payload is larger than about 6KB (75% of 8KB).

The truth is that 8KB buys very little in terms of today's storage demands. Even though it's the limit for a single queue item, it can be prohibitive for certain types of messages, especially those that cannot be broken down into smaller chunks. From a developer's perspective, worrying about whether a given message can be accommodated on a queue doesn't help my productivity. At the end of the day, the goal is to get my application data to flow between producers and consumers in the most efficient way, regardless of the data size. While one side calls Put (or Enqueue) and the other side invokes Get (or Dequeue) against a queue, the rest should theoretically happen auto-magically.

Overcoming the 8KB limitation in Windows Azure queues by employing a smart way of dealing with large messages is the key premise of the technical challenge elaborated in this post. This will come at the cost of some additional craftsmanship. In the modern world of commercial software development, any extra development effort needs to be wisely justified. I am going to justify the additional investment with the following design goals:

  • Support for very large messages through eliminating any restrictions imposed by the Queue Service API as it pertains to the message size.

  • Support for user-defined generic objects when publishing and consuming messages from a Windows Azure queue.

  • Transparent overflow into a configurable message store, whether a blob container, a distributed cache, or another type of repository capable of storing large messages.

  • Transparent compression that is intended to increase cost-efficiency by minimizing the amount of storage space consumed by large messages.

  • Increased reliability in the form of extensive use of the transient condition handling best practices when performing queue operations.

The foundation for supporting large messages in size-constrained queues will be the following pattern. First, I check if a given message can be accommodated on a Windows Azure queue without performing any extra work. The way to determine whether a message can be safely stored on a queue without violating size constraints will be through a formula which I wrap into a helper function as follows:

/// <summary>
/// Provides helper methods to enable cloud application code to invoke common, globally accessible functions.
/// </summary>
public static class CloudUtility
{
    /// <summary>
    /// Verifies whether or not the specified message size can be accommodated in a Windows Azure queue.
    /// </summary>
    /// <param name="size">The message size value to be inspected.</param>
    /// <returns>True if the specified size can be accommodated in a Windows Azure queue, otherwise false.</returns>
    public static bool IsAllowedQueueMessageSize(long size)
    {
        return size >= 0 && size <= (CloudQueueMessage.MaxMessageSize - 1) / 4 * 3;
    }
}
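To make the arithmetic concrete, here is a quick sanity check of the helper. The figures below simply evaluate the formula above for the current 8192-byte limit; they are not part of the sample code.

// Quick sanity check of the formula above (not part of the sample code).
// With MaxMessageSize = 8192, integer arithmetic gives (8192 - 1) / 4 * 3 = 2047 * 3 = 6141 bytes,
// which is roughly the 6KB payload ceiling mentioned earlier.
Console.WriteLine(CloudUtility.IsAllowedQueueMessageSize(6141));   // True
Console.WriteLine(CloudUtility.IsAllowedQueueMessageSize(6142));   // False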

If a message is under the 8KB size constraint, I simply invoke the Queue Service API to enqueue the message "as is". If the message size exceeds the limitation in question, the data flow becomes quite interesting. The following flowchart visualizes the subsequent steps:

[Figure: Store Message Flow]

In summary, if a message cannot be accommodated on a queue due to its size, it overflows into a message store capable of storing large messages. A tiny metadata message is then created, consisting of a reference to the item in the overflow store. Finally, the metadata message is put on the queue. Note that I always compress a message before checking its suitability for persistence in a queue. By doing this, I effectively expand the population of messages that can be queued without incurring the need to go into the overflow store. A good example is an XML document slightly larger than 8KB which, after serialization and compression, becomes a perfect candidate for simply being put on a queue. You can modify this behavior if the default compression is not desirable, by providing a custom serializer component elaborated in the next section.

There are several considerations that apply here, mainly from a cost perspective. As can be noted in the above flowchart, I first attempt to determine whether a large message can overflow into the Windows Azure AppFabric distributed cache (referred to herein as the AppFabric Cache for brevity). Since usage of the distributed cloud-based caching service is subject to a charge, the cache overflow path is made optional. This is reflected in the flowchart.

In addition, there may be situations when a message is so large that it is not suitable for storage in a size-constrained distributed cache. As of the writing of this post, the maximum cache size is 4GB. Therefore, we must take this into consideration and provide a failover path should we exceed the cache capacity or quotas. The quotas come with eviction behavior that also needs to be accounted for.

Important:

The use of the Windows Azure AppFabric Cache as an overflow store helps reduce latency and eliminate excessive storage transactions when exchanging a large number of messages in the model proposed above. It offers a highly available, distributed caching infrastructure capable of replicating and maintaining cached data in memory across multiple cache servers for durability. These benefits can be outweighed by the cache size limitation and the costs associated with the service usage. It is therefore important to perform a cost-benefit analysis to assess the pros and cons of introducing the AppFabric Cache as an overflow store in certain scenarios.

Given that the distributed cache storage is limited, it is essential to set out some further rules that will enable the efficient use of the cache. In connection to this, one important recommendation needs to be explicitly called out:

Important:

Due to the specifics of its eviction behavior, the AppFabric Cache does not offer complete durability when compared to the Windows Azure Blob Service. When used as an overflow store, the AppFabric Cache is best suited to messages that are volatile in nature and under 8MB in size. The term "volatile" means that messages are published into and subsequently consumed from a Windows Azure queue as quickly as possible. The 8MB recommendation is due to the optimal cache item size configured in the AppFabric Cache by default.

I’m going to reflect the above recommendation in the code by providing a helper function that will determine whether or not the specified item size value can be considered as optimal when storing an item of the given size in the cache.

public static class CloudUtility
{
    private static readonly long optimalCacheItemSize = 8 * 1024 * 1024;

    /// <summary>
    /// Determines whether the specified value can be considered as optimal when storing an item of a given size in the cache.
    /// </summary>
    /// <param name="size">The item size value to be inspected.</param>
    /// <returns>True if the specified size can be considered as optimal, otherwise false.</returns>
    public static bool IsOptimalCacheItemSize(long size)
    {
        return size >= 0 && size <= optimalCacheItemSize;
    }
}

Now that some initial prerequisites have been considered, it's time to switch over to the consumer side and take a look at the implementation pattern for retrieving large messages from a queue. First, let's visualize the process flow to facilitate overall understanding:

[Figure: Large Message Overflow Diagram]

To summarize the above flow, a message of an unknown type is fetched from a queue and compared against the metadata message type. If it is not a metadata message, the flow continues with decompression logic so that the original message can be correctly reconstructed before being presented to the consumer. If, by contrast, it is a metadata message, it is inspected to determine the type of overflow store that was used for storing the actual message. If it is identified as a message stored in the cache, the respective AppFabric Cache API is invoked and the real message is fetched before being decompressed and returned to the consumer. If the real message was put into a blob container, the Blob Service API is used to retrieve it from the blob, after which it is decompressed and handed back to the caller.
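The Get<T> implementation that realizes this flow lives in the downloadable sample; the snippet below is only a rough sketch of the consumer-side branching, not the sample's actual code. The TryGetMetadata helper is hypothetical and stands in for whatever type-detection logic the real implementation performs, while dataSerializer and overflowStorage are the members introduced later in the ReliableCloudQueueStorage class.

// Rough sketch of the consumer-side branching (an assumption, not the sample's actual code).
private T FromQueueMessage<T>(CloudQueueMessage queueMessage)
{
    using (var dataStream = new MemoryStream(queueMessage.AsBytes))
    {
        // Hypothetical helper: returns the metadata object if the payload is a LargeQueueMessageInfo, otherwise null.
        LargeQueueMessageInfo msgInfo = TryGetMetadata(dataStream);

        if (msgInfo == null)
        {
            // Regular message: deserialization (which also decompresses) yields the original object.
            dataStream.Seek(0, SeekOrigin.Begin);
            return (T)this.dataSerializer.Deserialize(dataStream, typeof(T));
        }

        // Metadata message: fetch the real payload from the overflow store (AppFabric Cache or blob container).
        return this.overflowStorage.Get<T>(msgInfo.ContainerName, msgInfo.BlobReference);
    }
}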

In addition to handling Enqueue and Dequeue operations for large messages, there is a need to make sure that all overflowed payloads are removed from their respective overflow message stores when the consumer deletes a message. To accomplish this, one potential implementation pattern is to couple the removal process with the Delete operation invoked for a given message. The visual representation of this operation can be depicted as follows:

[Figure: Large Message Overflow Diagram]
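The Delete<T> implementation is likewise part of the downloadable sample. Conceptually it could take the following shape; this is a sketch only, and the InflightMessageInfo members used here (IsOverflown, QueueName, QueueMessage and so on) are assumptions based on the tracking dictionary declared in the constructor shown later.

// Sketch of coupling Delete with overflow cleanup (an assumption, not the sample's actual code).
public bool Delete<T>(T message)
{
    InflightMessageInfo msgInfo;

    // Look up the tracking record captured when the message was originally dequeued.
    if (!this.inflightMessages.TryRemove(message, out msgInfo))
    {
        return false;
    }

    // If the payload overflowed into the blob container or cache, remove it from there first.
    if (msgInfo.IsOverflown)
    {
        this.overflowStorage.Delete(msgInfo.ContainerName, msgInfo.BlobReference);
    }

    // Finally, delete the metadata (or regular) message from the queue itself.
    var queue = this.queueStorage.GetQueueReference(msgInfo.QueueName);
    this.retryPolicy.ExecuteAction(() => queue.DeleteMessage(msgInfo.QueueMessage));

    return true;
}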

Before we start implementing the patterns mentioned above, one last consideration worth making is the definition of a message. What would be considered a message, and what forms can it manifest itself in? Would it be a byte array, a stream of data, a simple type like a string, or a complex application-specific object that the developer implements as part of the solution's object model? I truly believe that this is an area where we should not constrain ourselves. Let's just assume that a message is of generic type <T>, meaning it's anything the developer wishes to use. You will see that the end implementation naturally unfolds around this idea.

Putting it all together, the following diagram summarizes all three possible travel paths accounted for in the above design:

[Diagram: the three possible message travel paths]

At this point, there seems to be enough input to start bringing the technical design to life. From this point onwards, I will shift focus to the source code required to implement the patterns discussed above.

Technical Implementation

To follow along, download the full sample code from the MSDN Code Gallery. The sample is shipped as part of a larger end-to-end reference implementation which is powered by the patterns discussed in this post. Once downloaded and unzipped, navigate to the Azure.Services.Framework project under Contoso.Cloud.Integration.Azure and expand the Storage folder. This location contains all the main code artifacts discussed below.

As noted at the beginning, the original idea was to abstract the way a cloud application interacts with Windows Azure queues. I approach this requirement by providing a contract that governs the main operations supported by my custom storage abstraction layer. The programming interface through which the contract surfaces to consumers is shown below. Note that I intentionally omitted a few infrastructure-level functions from the code snippet, such as the creation and deletion of queues, since these do not add significant value at this time.

/// <summary>
/// Defines a generics-aware abstraction layer for Windows Azure Queue storage.
/// </summary>
public interface ICloudQueueStorage : IDisposable
{
    /// <summary>
    /// Puts a single message on a queue.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the message.</typeparam>
    /// <param name="queueName">The target queue name on which message will be placed.</param>
    /// <param name="message">The payload to be put into a queue.</param>
    void Put<T>(string queueName, T message);

    /// <summary>
    /// Retrieves a single message from the specified queue and applies the default visibility timeout.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the message.</typeparam>
    /// <param name="queueName">The name of the source queue.</param>
    /// <returns>An instance of the object that has been fetched from the queue.</returns>
    T Get<T>(string queueName);

    /// <summary>
    /// Gets a collection of messages from the specified queue and applies the specified visibility timeout.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the message.</typeparam>
    /// <param name="queueName">The name of the source queue.</param>
    /// <param name="count">The number of messages to retrieve.</param>
    /// <param name="visibilityTimeout">The timeout during which retrieved messages will remain invisible on the queue.</param>
    /// <returns>The list of messages retrieved from the specified queue.</returns>
    IEnumerable<T> Get<T>(string queueName, int count, TimeSpan visibilityTimeout);

    /// <summary>
    /// Gets a collection of messages from the specified queue and applies the default visibility timeout.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the message.</typeparam>
    /// <param name="queueName">The name of the source queue.</param>
    /// <param name="count">The number of messages to retrieve.</param>
    /// <returns>The list of messages retrieved from the specified queue.</returns>
    IEnumerable<T> Get<T>(string queueName, int count);

    /// <summary>
    /// Deletes a single message from a queue.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the message.</typeparam>
    /// <param name="message">The message to be deleted from a queue.</param>
    /// <returns>A flag indicating whether or not the specified message has actually been deleted.</returns>
    bool Delete<T>(T message);
}
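To illustrate how the contract is intended to be consumed, here is a hypothetical usage sketch. The PurchaseOrder type, the "orders" queue name and the way the ICloudQueueStorage instance is obtained are all invented for illustration and are not part of the sample code.

// Illustrative message type; any serializable type works with the generic contract.
using System;
using System.Runtime.Serialization;

[DataContract]
public class PurchaseOrder
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string LargeEdiPayload { get; set; }
}

public static class UsageExample
{
    // 'queueStorage' would typically be a ReliableCloudQueueStorage instance (shown later in the post).
    public static void Run(ICloudQueueStorage queueStorage)
    {
        var order = new PurchaseOrder { Id = 12345, LargeEdiPayload = new string('X', 512 * 1024) };

        // The producer does not need to care whether the serialized payload fits into 8KB.
        queueStorage.Put("orders", order);

        // The consumer gets the original object back, regardless of where the payload actually lived.
        PurchaseOrder received = queueStorage.Get<PurchaseOrder>("orders");

        // Delete removes both the queue item and any overflowed payload from the overflow store.
        if (received != null)
        {
            queueStorage.Delete(received);
        }
    }
}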

There is also a need for one extra contract (interface) that abstracts access to the large message overflow store. This contract will be implemented by two components, one for each overflow store (blob storage and distributed cache) that I intend to support. The contract comprises the following operations:

/// <summary>
/// Defines a generics-aware abstraction layer for Windows Azure Blob storage.
/// </summary>
public interface ICloudBlobStorage : IDisposable
{
    /// <summary>
    /// Puts a blob into the underlying storage, overwrites if the blob with the same name already exists.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
    /// <param name="containerName">The target blob container name into which a blob will be stored.</param>
    /// <param name="blobName">The custom name associated with the blob.</param>
    /// <param name="blob">The blob's payload.</param>
    /// <returns>True if the blob was successfully put into the specified container, otherwise false.</returns>
    bool Put<T>(string containerName, string blobName, T blob);

    /// <summary>
    /// Puts a blob into the underlying storage. If the blob with the same name already exists, overwrite behavior can be applied. 
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
    /// <param name="containerName">The target blob container name into which a blob will be stored.</param>
    /// <param name="blobName">The custom name associated with the blob.</param>
    /// <param name="blob">The blob's payload.</param>
    /// <param name="overwrite">The flag indicating whether or not overwriting the existing blob is permitted.</param>
    /// <returns>True if the blob was successfully put into the specified container, otherwise false.</returns>
    bool Put<T>(string containerName, string blobName, T blob, bool overwrite);

    /// <summary>
    /// Retrieves a blob by its name from the underlying storage.
    /// </summary>
    /// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
    /// <param name="containerName">The target blob container name from which the blob will be retrieved.</param>
    /// <param name="blobName">The custom name associated with the blob.</param>
    /// <returns>An instance of <typeparamref name="T"/> or default(T) if the specified blob was not found.</returns>
    T Get<T>(string containerName, string blobName);

    /// <summary>
    /// Deletes the specified blob.
    /// </summary>
    /// <param name="containerName">The target blob container name from which the blob will be deleted.</param>
    /// <param name="blobName">The custom name associated with the blob.</param>
    /// <returns>True if the blob was deleted, otherwise false.</returns>
    bool Delete(string containerName, string blobName);
}

As can be noted from the above code snippets, both contracts rely heavily on the generic type <T>. It enables you to tailor the message type to any .NET type of your choice. I will, however, have to handle some extreme use cases, namely types that require special treatment, such as streams. I will expand on this later.

Regardless of the message type chosen, one important requirement applies: the object type that represents a message on a queue must be serializable. All objects passing through the storage abstraction layer are subject to serialization before they land on a queue or in the overflow store, and to deserialization upon retrieval. In my implementation, serialization and deserialization are also coupled with compression and decompression, respectively. This approach increases efficiency from a cost and bandwidth perspective. The cost-related benefit comes from the fact that compressed large messages inherently consume less storage, resulting in lower storage costs. The bandwidth efficiency arises from the savings in payload size, which makes payloads smaller on the wire as they flow to and from the Windows Azure storage or distributed cache infrastructure.

The requirement for serialization and deserialization will be declared in a specialized interface. Any component implementing this interface must provide the specific compression, serialization, deserialization, and decompression functionality. An example of this interface is shown below:

/// <summary>
/// Defines a contract that must be supported by a component which performs serialization and 
/// deserialization of storage objects such as Azure queue items, blobs and table entries.
/// </summary>
public interface ICloudStorageEntitySerializer
{
    /// <summary>
    /// Serializes the object to the specified stream.
    /// </summary>
    /// <param name="instance">The object instance to be serialized.</param>
    /// <param name="target">The destination stream into which the serialized object will be written.</param>
    void Serialize(object instance, Stream target);

    /// <summary>
    /// Deserializes the object from specified data stream.
    /// </summary>
    /// <param name="source">The source stream from which serialized object will be consumed.</param>
    /// <param name="type">The type of the object that will be deserialized.</param>
    /// <returns>The deserialized object instance.</returns>
    object Deserialize(Stream source, Type type);
}

As it relates to compression and decompression, I'm going to opt in to using the DeflateStream class from the .NET Framework. This class implements the Deflate algorithm, an industry-standard, RFC-compliant algorithm for lossless compression and decompression. Compared to its neighbor, the GZipStream class, DeflateStream produces more compact compressed output and generally delivers better performance. By contrast, the GZipStream class uses the GZIP data format, which includes a cyclic redundancy check (CRC) value for detecting data corruption. Behind the scenes, the GZIP data format uses the same compression algorithm as the DeflateStream class. In summary, GZipStream = DeflateStream + the cost of calculating and storing CRC checksums.
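Before looking at the serializer itself, here is a tiny stand-alone illustration of the Deflate round trip that the serializer relies on. This snippet is not taken from the sample code; it just demonstrates the DeflateStream usage pattern (leaving the outer stream open while compressing, and using CopyTo, available since .NET 4.0, to decompress).

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DeflateRoundTrip
{
    public static byte[] Compress(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen = true so the MemoryStream can still be read after the DeflateStream is disposed.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] raw = Encoding.UTF8.GetBytes(new string('A', 10000));
        byte[] packed = Compress(raw);

        Console.WriteLine("{0} bytes compressed to {1} bytes", raw.Length, packed.Length);
        Console.WriteLine(Decompress(packed).Length);   // 10000
    }
}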

My implementation of the above contract is included below. Note that the compression algorithm can easily be toggled by replacing the DeflateStream class with GZipStream, and vice versa.

/// <summary>
/// Provides a default implementation of ICloudStorageEntitySerializer which performs serialization and 
/// deserialization of storage objects such as Azure queue items, blobs and table entries.
/// </summary>
internal sealed class CloudStorageEntitySerializer : ICloudStorageEntitySerializer
{
    /// <summary>
    /// Serializes the object to the specified stream.
    /// </summary>
    /// <param name="instance">The object instance to be serialized.</param>
    /// <param name="target">The destination stream into which the serialized object will be written.</param>
    public void Serialize(object instance, Stream target)
    {
        Guard.ArgumentNotNull(instance, "instance");
        Guard.ArgumentNotNull(target, "target");

        XDocument xmlDocument = null;
        XElement xmlElement = null;
        XmlDocument domDocument = null;
        XmlElement domElement = null;

        if ((xmlElement = (instance as XElement)) != null)
        {
            // Handle XML element serialization using separate technique.
            SerializeXml<XElement>(xmlElement, target, (xml, writer) => { xml.Save(writer); });
        }
        else if ((xmlDocument = (instance as XDocument)) != null)
        {
            // Handle XML document serialization using separate technique.
            SerializeXml<XDocument>(xmlDocument, target, (xml, writer) => { xml.Save(writer); });
        }
        else if ((domDocument = (instance as XmlDocument)) != null)
        {
            // Handle XML DOM document serialization using separate technique.
            SerializeXml<XmlDocument>(domDocument, target, (xml, writer) => { xml.Save(writer); });
        }
        else if ((domElement = (instance as XmlElement)) != null)
        {
            // Handle XML DOM element serialization using separate technique.
            SerializeXml<XmlElement>(domElement, target, (xml, writer) => { xml.WriteTo(writer); });
        }
        else
        {
            var serializer = GetXmlSerializer(instance.GetType());

            using (var compressedStream = new DeflateStream(target, CompressionMode.Compress, true))
            using (var xmlWriter = XmlDictionaryWriter.CreateBinaryWriter(compressedStream, null, null, false))
            {
                serializer.WriteObject(xmlWriter, instance);
            }
        }
    }

    /// <summary>
    /// Deserializes the object from specified data stream.
    /// </summary>
    /// <param name="source">The source stream from which serialized object will be consumed.</param>
    /// <param name="type">The type of the object that will be deserialized.</param>
    /// <returns>The deserialized object instance.</returns>
    public object Deserialize(Stream source, Type type)
    {
        Guard.ArgumentNotNull(source, "source");
        Guard.ArgumentNotNull(type, "type");

        if (type == typeof(XElement))
        {
            // Handle XML element deserialization using separate technique.
            return DeserializeXml<XElement>(source, (reader) => { return XElement.Load(reader); });
        }
        else if (type == typeof(XDocument))
        {
            // Handle XML document deserialization using separate technique.
            return DeserializeXml<XDocument>(source, (reader) => { return XDocument.Load(reader); });
        }
        else if (type == typeof(XmlDocument))
        {
            // Handle XML DOM document deserialization using separate technique.
            return DeserializeXml<XmlDocument>(source, (reader) => { var xml = new XmlDocument(); xml.Load(reader); return xml; });
        }
        else if (type == typeof(XmlElement))
        {
            // Handle XML DOM element deserialization using separate technique.
            return DeserializeXml<XmlElement>(source, (reader) => { var xml = new XmlDocument(); xml.Load(reader); return xml.DocumentElement; });
        }
        else
        {
            var serializer = GetXmlSerializer(type);

            using (var compressedStream = new DeflateStream(source, CompressionMode.Decompress, true))
            using (var xmlReader = XmlDictionaryReader.CreateBinaryReader(compressedStream, XmlDictionaryReaderQuotas.Max))
            {
                return serializer.ReadObject(xmlReader);
            }
        }
    }

    private XmlObjectSerializer GetXmlSerializer(Type type)
    {
        if (FrameworkUtility.GetDeclarativeAttribute<DataContractAttribute>(type) != null)
        {
            return new DataContractSerializer(type);
        }
        else
        {
            return new NetDataContractSerializer();
        }
    }

    private void SerializeXml<T>(T instance, Stream target, Action<T, XmlWriter> serializeAction)
    {
        using (var compressedStream = new DeflateStream(target, CompressionMode.Compress, true))
        using (var xmlWriter = XmlDictionaryWriter.CreateBinaryWriter(compressedStream, null, null, false))
        {
            serializeAction(instance, xmlWriter);

            xmlWriter.Flush();
            compressedStream.Flush();
        }
    }

    private T DeserializeXml<T>(Stream source, Func<XmlReader, T> deserializeAction)
    {
        using (var compressedStream = new DeflateStream(source, CompressionMode.Decompress, true))
        using (var xmlReader = XmlDictionaryReader.CreateBinaryReader(compressedStream, XmlDictionaryReaderQuotas.Max))
        {
            return deserializeAction(xmlReader);
        }
    }
}

One of the powerful capabilities of the CloudStorageEntitySerializer implementation is the ability to apply special treatment when handling XML documents of both flavors: XmlDocument and XDocument. Another area worth highlighting is the optimal serialization and deserialization of XML data. Here I decided to take advantage of the XmlDictionaryReader and XmlDictionaryWriter classes, which are known to .NET developers as a superb choice for performing efficient serialization and deserialization of XML payloads using the .NET Binary XML format.
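For illustration, a round trip through the serializer might look like the following sketch. Note that CloudStorageEntitySerializer is internal to the sample assembly, so external callers would receive it through the ICloudStorageEntitySerializer dependency instead; this snippet is illustrative rather than part of the sample.

// Hypothetical round trip: serialize an XDocument into a compressed binary-XML stream and back.
ICloudStorageEntitySerializer serializer = new CloudStorageEntitySerializer();

var document = new XDocument(new XElement("Order", new XElement("Id", 12345)));

using (var buffer = new MemoryStream())
{
    // Serialize writes a Deflate-compressed binary XML image of the document into the buffer.
    serializer.Serialize(document, buffer);

    // Rewind and reconstruct the original document from the compressed stream.
    buffer.Seek(0, SeekOrigin.Begin);
    var roundTripped = (XDocument)serializer.Deserialize(buffer, typeof(XDocument));

    Console.WriteLine(roundTripped.Root.Element("Id").Value);   // 12345
}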

Switching topics, the decision as to the type of overflow message store is the responsibility of the consumer that calls into my custom storage abstraction layer. Along these lines, I'm going to provide an option to select the desired message store type by adding the following constructors to the type implementing the ICloudQueueStorage interface:

/// <summary>
/// Provides reliable generics-aware access to the Windows Azure Queue storage.
/// </summary>
public sealed class ReliableCloudQueueStorage : ICloudQueueStorage
{
    private readonly RetryPolicy retryPolicy;
    private readonly CloudQueueClient queueStorage;
    private readonly ICloudStorageEntitySerializer dataSerializer;
    private readonly ICloudBlobStorage overflowStorage;
    private readonly ConcurrentDictionary<object, InflightMessageInfo> inflightMessages;

    /// <summary>
    /// Initializes a new instance of the <see cref="ReliableCloudQueueStorage"/> class using the specified storage account information,
    /// custom retry policy and custom implementation of the large message overflow store.
    /// </summary>
    /// <param name="storageAccountInfo">The storage account that is projected through this component.</param>
    /// <param name="retryPolicy">The specific retry policy that will ensure reliable access to the underlying storage.</param>
    /// <param name="overflowStorage">The component implementing overflow store that will be used for persisting large
    /// messages that cannot be accommodated in a queue due to message size constraints.</param>
    public ReliableCloudQueueStorage(StorageAccountInfo storageAccountInfo, RetryPolicy retryPolicy, ICloudBlobStorage overflowStorage)
        : this(storageAccountInfo, retryPolicy, new CloudStorageEntitySerializer(), overflowStorage)
    {
    }

    /// <summary>
    /// Initializes a new instance of the <see cref="ReliableCloudQueueStorage"/> class using the specified storage account information,
    /// custom retry policy, custom implementation of <see cref="ICloudStorageEntitySerializer"/> interface and custom implementation of
    /// the large message overflow store.
    /// </summary>
    /// <param name="storageAccountInfo">The storage account that is projected through this component.</param>
    /// <param name="retryPolicy">The specific retry policy that will ensure reliable access to the underlying storage.</param>
    /// <param name="dataSerializer">The component which performs serialization and deserialization of storage objects.</param>
    /// <param name="overflowStorage">The component implementing overflow store that will be used for persisting large messages that
    /// cannot be accommodated in a queue due to message size constraints.</param>
    public ReliableCloudQueueStorage(StorageAccountInfo storageAccountInfo, RetryPolicy retryPolicy, ICloudStorageEntitySerializer dataSerializer, ICloudBlobStorage overflowStorage)
    {
        Guard.ArgumentNotNull(storageAccountInfo, "storageAccountInfo");
        Guard.ArgumentNotNull(retryPolicy, "retryPolicy");
        Guard.ArgumentNotNull(dataSerializer, "dataSerializer");
        Guard.ArgumentNotNull(overflowStorage, "overflowStorage");

        this.retryPolicy = retryPolicy;
        this.dataSerializer = dataSerializer;
        this.overflowStorage = overflowStorage;

        CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentialsAccountAndKey(storageAccountInfo.AccountName, storageAccountInfo.AccountKey), true);
        this.queueStorage = storageAccount.CreateCloudQueueClient();

        // Configure the Queue storage not to enforce any retry policies since this is something we will be handling ourselves.
        this.queueStorage.RetryPolicy = RetryPolicies.NoRetry();

        this.inflightMessages = new ConcurrentDictionary<object, InflightMessageInfo>(Environment.ProcessorCount * 4, InflightMessageQueueInitialCapacity);
    }
}

The above constructors do not perform any complex work; they simply initialize internal members and configure the client component that will be accessing a Windows Azure queue. It's worth noting, however, that I explicitly tell the queue client not to enforce any retry policy. In order to provide a robust and reliable storage abstraction layer, I need more granular control over transient issues when performing operations against Windows Azure queues. Therefore, there will be a separate component that recognizes and is able to handle a much larger variety of intermittent faults.

Let's now take a look at the internals of the ReliableCloudQueueStorage class drafted above. Specifically, let's review its implementation of the Put operation, since this is where the transparent overflow into a large message store takes place.

/// <summary>
/// Puts a single message on a queue.
/// </summary>
/// <typeparam name="T">The type of the payload associated with the message.</typeparam>
/// <param name="queueName">The target queue name on which message will be placed.</param>
/// <param name="message">The payload to be put into a queue.</param>
public void Put<T>(string queueName, T message)
{
    Guard.ArgumentNotNullOrEmptyString(queueName, "queueName");
    Guard.ArgumentNotNull(message, "message");

    // Obtain a reference to the queue by its name. The name will be validated against compliance with storage resource names.
    var queue = this.queueStorage.GetQueueReference(CloudUtility.GetSafeContainerName(queueName));

    CloudQueueMessage queueMessage = null;

    // Allocate a memory buffer into which messages will be serialized prior to being put on a queue.
    using (MemoryStream dataStream = new MemoryStream(Convert.ToInt32(CloudQueueMessage.MaxMessageSize)))
    {
	// Perform serialization of the message data into the target memory buffer.
	this.dataSerializer.Serialize(message, dataStream);

	// Reset the position in the buffer as we will be reading its content from the beginning.
	dataStream.Seek(0, SeekOrigin.Begin);

	// First, determine whether the specified message can be accommodated on a queue.
	if (CloudUtility.IsAllowedQueueMessageSize(dataStream.Length))
	{
		queueMessage = new CloudQueueMessage(dataStream.ToArray());
	}
	else
	{
		// Create an instance of a large queue item metadata message.
		LargeQueueMessageInfo queueMsgInfo = LargeQueueMessageInfo.Create(queueName);

		// Persist the stream of data that represents a large message into the overflow message store.
		this.overflowStorage.Put<Stream>(queueMsgInfo.ContainerName, queueMsgInfo.BlobReference, dataStream);

		// Invoke the Put operation recursively to enqueue the metadata message.
		Put<LargeQueueMessageInfo>(queueName, queueMsgInfo);
	}
    }

    // Check if a message is available to be put on a queue.
    if (queueMessage != null)
    {
	Put(queue, queueMessage);
    }
}

I highlighted the most significant lines where interesting things happen. The inline comments are intended to provide contextual descriptions of what happens at every major step, so I will save valuable screen space by not reiterating the obvious.
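The private Put(queue, queueMessage) overload invoked at the end of the snippet is not reproduced above. Conceptually, it is just the raw enqueue call wrapped into the retry policy discussed earlier; the following is a sketch of what it might look like, not the sample's exact code.

// Sketch of the private overload referenced above (the sample's actual code may differ):
// the raw AddMessage call is wrapped into the retry-aware scope discussed earlier.
private void Put(CloudQueue queue, CloudQueueMessage queueMessage)
{
    this.retryPolicy.ExecuteAction(() =>
    {
        queue.AddMessage(queueMessage);
    });
}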

A new code artifact that has just manifested itself in the snippet above is the LargeQueueMessageInfo class. This custom type is ultimately our metadata message, describing the location of a large message. The class is marked internal as it's not intended to be visible to anyone outside the storage abstraction layer implementation. The class is defined as follows:

/// <summary>
/// Implements an object holding metadata related to a large message which is stored in 
/// the overflow message store such as Windows Azure blob container.
/// </summary>
[DataContract(Namespace = WellKnownNamespace.DataContracts.General)]
internal sealed class LargeQueueMessageInfo
{
    private const string ContainerNameFormat = "LargeMsgCache-{0}";

    /// <summary>
    /// Returns the name of the blob container holding the large message payload.
    /// </summary>
    [DataMember]
    public string ContainerName { get; private set; }

    /// <summary>
    /// Returns the unique reference to a blob holding the large message payload.
    /// </summary>
    [DataMember]
    public string BlobReference { get; private set; } 

    /// <summary>
    /// The default constructor is inaccessible, the object needs to be instantiated using its Create method.
    /// </summary>
    private LargeQueueMessageInfo() { }

    /// <summary>
    /// Creates a new instance of the large message metadata object and allocates a globally unique blob reference.
    /// </summary>
    /// <param name="queueName">The name of the Windows Azure queue on which a reference to the large message will be stored.</param>
    /// <returns>The instance of the large message metadata object.</returns>
    public static LargeQueueMessageInfo Create(string queueName)
    {
        Guard.ArgumentNotNullOrEmptyString(queueName, "queueName");

        return new LargeQueueMessageInfo() { ContainerName = String.Format(ContainerNameFormat, queueName), BlobReference = Guid.NewGuid().ToString("N") };
    }
}

Moving forward, I need to implement a large message overflow store that will leverage the Windows Azure Blob Storage service. As I pointed out earlier, this component must implement the ICloudBlobStorage interface, which the ReliableCloudQueueStorage component consumes to relay messages into the overflow store whenever they cannot be accommodated on a queue due to the message size limitation. To set the stage for the next steps, I will include only the constructors' implementation:

/// <summary>
/// Implements reliable generics-aware layer for Windows Azure Blob storage.
/// </summary>
public class ReliableCloudBlobStorage : ICloudBlobStorage
{
    private readonly RetryPolicy retryPolicy;
    private readonly CloudBlobClient blobStorage;
    private readonly ICloudStorageEntitySerializer dataSerializer;

    /// <summary>
    /// Initializes a new instance of the ReliableCloudBlobStorage class using the specified storage account information.
    /// Assumes the default use of RetryPolicy.DefaultExponential retry policy and default entity serializer.
    /// </summary>
    /// <param name="storageAccountInfo">The access credentials for Windows Azure storage account.</param>
    public ReliableCloudBlobStorage(StorageAccountInfo storageAccountInfo)
        : this(storageAccountInfo, new RetryPolicy<StorageTransientErrorDetectionStrategy>(RetryPolicy.DefaultClientRetryCount, RetryPolicy.DefaultMinBackoff, RetryPolicy.DefaultMaxBackoff, RetryPolicy.DefaultClientBackoff))
    {
    }

    /// <summary>
    /// Initializes a new instance of the ReliableCloudBlobStorage class using the specified storage account info and retry policy.
    /// </summary>
    /// <param name="storageAccountInfo">The access credentials for Windows Azure storage account.</param>
    /// <param name="retryPolicy">The custom retry policy that will ensure reliable access to the underlying blob storage.</param>
    public ReliableCloudBlobStorage(StorageAccountInfo storageAccountInfo, RetryPolicy retryPolicy)
        : this(storageAccountInfo, retryPolicy, new CloudStorageEntitySerializer())
    {
    }

    /// <summary>
    /// Initializes a new instance of the ReliableCloudBlobStorage class using the specified storage account info, custom retry
    /// policy and custom implementation of ICloudStorageEntitySerializer interface.
    /// </summary>
    /// <param name="storageAccountInfo">The access credentials for Windows Azure storage account.</param>
    /// <param name="retryPolicy">The custom retry policy that will ensure reliable access to the underlying storage.</param>
    /// <param name="dataSerializer">The component which performs serialization/deserialization of storage objects.</param>
    public ReliableCloudBlobStorage(StorageAccountInfo storageAccountInfo, RetryPolicy retryPolicy, ICloudStorageEntitySerializer dataSerializer)
    {
        Guard.ArgumentNotNull(storageAccountInfo, "storageAccountInfo");
        Guard.ArgumentNotNull(retryPolicy, "retryPolicy");
        Guard.ArgumentNotNull(dataSerializer, "dataSerializer");

        this.retryPolicy = retryPolicy;
        this.dataSerializer = dataSerializer;

        CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentialsAccountAndKey(storageAccountInfo.AccountName, storageAccountInfo.AccountKey), true);
        this.blobStorage = storageAccount.CreateCloudBlobClient();

        // Configure the Blob storage not to enforce any retry policies since this is something we will be handling ourselves.
        this.blobStorage.RetryPolicy = RetryPolicies.NoRetry();

        // Disable parallelism in blob upload operations to avoid contention between multiple concurrent worker threads and the client's parallel block upload feature.
        this.blobStorage.ParallelOperationThreadCount = 1;
    }
}

The last line is worth a couple of remarks. While stress-testing the storage abstraction layer discussed in this post, I observed non-deterministic behavior manifesting itself as a runtime exception of type System.ArgumentException. The error message stated that the "Waithandle array may not be empty", which was not sufficiently descriptive. It turned out that because I was stressing the blob storage from multiple threads and multiple worker roles, I had run into a high-concurrency issue. This concurrency issue can be attributed to the way the Blob Service client APIs internally handle blobs that exceed the SingleBlobUploadThresholdInBytes value. In essence, my large messages were automatically divided into blocks that were uploaded individually and assembled into the complete blob by the service. The ParallelOperationThreadCount property specifies how many blocks may be uploaded simultaneously. With many threads hitting my storage abstraction layer, it became apparent that the default parallelization behavior was causing undesirable overhead. The Windows Azure Storage team has published a blog post detailing recommendations for handling parallel uploads.

Earlier in this post, I showed the implementation of the Put operation, which ensures that small messages are always placed directly on a queue whereas large messages are transparently routed into the overflow store. For the sake of continuity, let's now review the mechanics behind the counterpart Put operation implemented by the overflow store.

/// <summary>
/// Puts a blob into the underlying storage, overwrites the existing blob if the blob with the same name already exists.
/// </summary>
/// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
/// <param name="containerName">The target blob container name into which a blob will be stored.</param>
/// <param name="blobName">The custom name associated with the blob.</param>
/// <param name="blob">The blob's payload.</param>
/// <returns>True if the blob was successfully put into the specified container, otherwise false.</returns>
public bool Put<T>(string containerName, string blobName, T blob)
{
    return Put<T>(containerName, blobName, blob, true);
}

/// <summary>
/// Puts a blob into the underlying storage. If the blob with the same name already exists, overwrite behavior can be customized. 
/// </summary>
/// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
/// <param name="containerName">The target blob container name into which a blob will be stored.</param>
/// <param name="blobName">The custom name associated with the blob.</param>
/// <param name="blob">The blob's payload.</param>
/// <param name="overwrite">The flag indicating whether or not overwriting the existing blob is permitted.</param>
/// <returns>True if the blob was successfully put into the specified container, otherwise false.</returns>
public bool Put<T>(string containerName, string blobName, T blob, bool overwrite)
{
    string eTag;
    return Put<T>(containerName, blobName, blob, overwrite, null, out eTag);
}

/// Private methods do not have to have inline XML comment tags. Hence, I saved myself some time.
private bool Put<T>(string containerName, string blobName, T blob, bool overwrite, string expectedEtag, out string actualEtag)
{
    Guard.ArgumentNotNullOrEmptyString(containerName, "containerName");
    Guard.ArgumentNotNullOrEmptyString(blobName, "blobName");
    Guard.ArgumentNotNull(blob, "blob");

    var callToken = TraceManager.CloudStorageComponent.TraceIn(containerName, blobName, overwrite, expectedEtag);

    // Verify whether or not the specified blob is already of type Stream.
    Stream blobStream = IsStreamType(blob.GetType()) ? blob as Stream : null;
    Stream blobData = null;
    actualEtag = null;

    try
    {
        // Are we dealing with a stream already? If yes, just use it as is.
        if (blobStream != null)
        {
            blobData = blobStream;
        }
        else
        {
            // The specified blob is something else rather than a Stream, we perform serialization of T into a new stream instance.
            blobData = new MemoryStream();
            this.dataSerializer.Serialize(blob, blobData);
        }

        var container = this.blobStorage.GetContainerReference(CloudUtility.GetSafeContainerName(containerName));
        StorageErrorCode lastErrorCode = StorageErrorCode.None;

        Func<string> uploadAction = () =>
        {
            var cloudBlob = container.GetBlobReference(blobName);
            return UploadBlob(cloudBlob, blobData, overwrite, expectedEtag);
        };

        try
        {
            // First attempt - perform upload and let the UploadBlob method handle any retry conditions.
            string eTag = uploadAction();

            if (!String.IsNullOrEmpty(eTag))
            {
                actualEtag = eTag;
                return true;
            }
        }
        catch (StorageClientException ex)
        {
            lastErrorCode = ex.ErrorCode;

            if (!(lastErrorCode == StorageErrorCode.ContainerNotFound || lastErrorCode == StorageErrorCode.ResourceNotFound || lastErrorCode == StorageErrorCode.BlobAlreadyExists))
            {
                // Anything other than "not found" or "already exists" conditions will be considered as a runtime error.
                throw;
            }
        }

        if (lastErrorCode == StorageErrorCode.ContainerNotFound)
        {
            // Failover action #1: create the target container and try again. This time, use a retry policy to wrap calls to the
            // UploadBlob method.
            string eTag = this.retryPolicy.ExecuteAction<string>(() =>
            {
                CreateContainer(containerName);
                return uploadAction();
            });

            return !String.IsNullOrEmpty(actualEtag = eTag);
        }

        if (lastErrorCode == StorageErrorCode.BlobAlreadyExists && overwrite)
        {
            // Failover action #2: Overwrite was requested but BlobAlreadyExists has still been returned.
            // Delete the original blob and try to upload again.
            string eTag = this.retryPolicy.ExecuteAction<string>(() =>
            {
                var cloudBlob = container.GetBlobReference(blobName);
                cloudBlob.DeleteIfExists();

                return uploadAction();
            });

            return !String.IsNullOrEmpty(actualEtag = eTag);
        }
    }
    finally
    {
        // Only dispose the blob data stream if it was newly created.
        if (blobData != null && null == blobStream)
        {
            blobData.Dispose();
        }

        TraceManager.CloudStorageComponent.TraceOut(callToken, actualEtag);
    }

    return false;
}

In summary, the above code takes a blob of type <T> and first checks whether it is already a serialized image of a message in the form of a Stream object. All large messages relayed to the overflow storage by the ReliableCloudQueueStorage component arrive as streams ready for persistence. Next, the UploadBlob action is invoked, which in turn calls into the Blob Service Client API, specifically its UploadFromStream operation. If a large message blob fails to upload successfully, the code inspects the error returned by the Blob Service and provides a failover path for two conditions: ContainerNotFound and BlobAlreadyExists. If the target blob container is not found, the code attempts to create the missing container. It performs this action within a retry-aware scope to improve reliability and increase resilience to transient failures. The second failover path handles a situation where a blob with the same name already exists. The code removes the existing blob, provided the overwrite behavior is enabled. After removal, the upload of the new blob is retried. Again, this operation is performed inside a retry-aware scope for increased reliability.
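The UploadBlob helper referenced above is not included in this excerpt. As a rough, hedged sketch only, it could look something like the following; the access-condition handling shown here is an assumption about how the overwrite and ETag semantics might be expressed with the v1.x storage client, and the sample's actual code differs in detail.

// Sketch of the UploadBlob helper referenced above (an assumption, not the sample's actual code).
private string UploadBlob(CloudBlob cloudBlob, Stream blobData, bool overwrite, string expectedEtag)
{
    blobData.Seek(0, SeekOrigin.Begin);

    var options = new BlobRequestOptions();

    if (!String.IsNullOrEmpty(expectedEtag))
    {
        // Optimistic concurrency: only replace the blob if it still carries the expected ETag.
        options.AccessCondition = AccessCondition.IfMatch(expectedEtag);
    }
    else if (!overwrite)
    {
        // Only create the blob if it does not already exist; otherwise the service reports BlobAlreadyExists.
        options.AccessCondition = AccessCondition.IfNoneMatch("*");
    }

    // UploadFromStream transparently switches to block upload for payloads above SingleBlobUploadThresholdInBytes.
    cloudBlob.UploadFromStream(blobData, options);

    // The ETag assigned by the Blob service identifies the stored version of the payload.
    return cloudBlob.Properties.ETag;
}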

Now that I can store large messages in a blob container, it’s time to design another implementation of the ICloudBlobStorage interface that will leverage the Windows Azure AppFabric Caching Service. For consistency, let’s start off with its constructors:

/// <summary>
/// Implements reliable generics-aware layer for Windows Azure AppFabric Caching Service.
/// </summary>
public class ReliableCloudCacheStorage : ICloudBlobStorage
{
    private readonly RetryPolicy retryPolicy;
    private readonly ICloudStorageEntitySerializer dataSerializer;
    private readonly DataCacheFactory cacheFactory;
    private readonly DataCache cache;

    /// <summary>
    /// Initializes a new instance of the ReliableCloudCacheStorage class using the specified caching service endpoint details.
    /// Assumes the default use of RetryPolicy.DefaultExponential retry policy and default implementation of entity serializer component.
    /// </summary>
    /// <param name="endpointInfo">The endpoint details for Windows Azure AppFabric Caching Service.</param>
    public ReliableCloudCacheStorage(CachingServiceEndpointInfo endpointInfo)
        : this(endpointInfo, new RetryPolicy<CacheTransientErrorDetectionStrategy>(RetryPolicy.DefaultClientRetryCount, RetryPolicy.DefaultMinBackoff, RetryPolicy.DefaultMaxBackoff, RetryPolicy.DefaultClientBackoff))
    {
    }

    /// <summary>
    /// Initializes a new instance of the ReliableCloudCacheStorage class using the specified caching service endpoint
    /// and a custom retry policy.
    /// </summary>
    /// <param name="endpointInfo">The endpoint details for Windows Azure AppFabric Caching Service.</param>
    /// <param name="retryPolicy">The custom retry policy that will ensure reliable access to the Caching Service.</param>
    public ReliableCloudCacheStorage(CachingServiceEndpointInfo endpointInfo, RetryPolicy retryPolicy)
        : this(endpointInfo, retryPolicy, new CloudStorageEntitySerializer())
    {
    }

    /// <summary>
    /// Initializes a new instance of the ReliableCloudCacheStorage class using the specified storage account information
    /// custom retry policy and custom implementation of ICloudStorageEntitySerializer interface.
    /// </summary>
    /// <param name="endpointInfo">The endpoint details for Windows Azure AppFabric Caching Service.</param>
    /// <param name="retryPolicy">The custom retry policy that will ensure reliable access to the Caching Service.</param>
    /// <param name="dataSerializer">The component which performs custom serialization and deserialization of cache items.</param>
    public ReliableCloudCacheStorage(CachingServiceEndpointInfo endpointInfo, RetryPolicy retryPolicy, ICloudStorageEntitySerializer dataSerializer)
    {
        Guard.ArgumentNotNull(endpointInfo, "endpointInfo");
        Guard.ArgumentNotNull(retryPolicy, "retryPolicy");
        Guard.ArgumentNotNull(dataSerializer, "dataSerializer");

        this.retryPolicy = retryPolicy;
        this.dataSerializer = dataSerializer;

        var cacheServers = new List<DataCacheServerEndpoint>(1);
        cacheServers.Add(new DataCacheServerEndpoint(endpointInfo.ServiceHostName, endpointInfo.CachePort));

        var cacheConfig = new DataCacheFactoryConfiguration()
        {
            Servers = cacheServers,
            MaxConnectionsToServer = 1,
            IsCompressionEnabled = false,
            SecurityProperties = new DataCacheSecurity(endpointInfo.SecureAuthenticationToken, endpointInfo.SslEnabled),
            // The ReceiveTimeout value has been modified as per recommendations provided in
            // http://blogs.msdn.com/b/akshar/archive/2011/05/01/azure-appfabric-caching-errorcode-lt-errca0017-gt-substatus-lt-es0006-gt-what-to-do.aspx
            TransportProperties = new DataCacheTransportProperties() { ReceiveTimeout = TimeSpan.FromSeconds(45) }
        };

        this.cacheFactory = new DataCacheFactory(cacheConfig);
        this.cache = this.retryPolicy.ExecuteAction<DataCache>(() =>
        {
            return this.cacheFactory.GetDefaultCache();
        });
    }
}

If you recall from earlier considerations, one of the key technical design decisions was to take advantage of both the Blob Service and Caching Service for storing large messages. The cache option is mostly suited for transient objects not exceeding the recommended payload size of 8MB. The blob option is essentially for everything else. Overall, this decision introduces the need for a hybrid overflow store. The foundation for building a hybrid store is already in the codebase. It’s just a matter of marrying the existing artifacts together as follows:

/// <summary>
/// Implements reliable generics-aware storage layer combining Windows Azure Blob storage and
/// Windows Azure AppFabric Cache in a hybrid mode.
/// </summary>
public class ReliableHybridBlobStorage : ICloudBlobStorage
{
    private readonly ICloudBlobStorage blobStorage;
    private readonly ICloudBlobStorage cacheStorage;
    private readonly ICloudStorageEntitySerializer dataSerializer;
    private readonly IList<ICloudBlobStorage> storageList;

    /// <summary>
    /// Initializes a new instance of the ReliableHybridBlobStorage class using the specified storage account information and caching
    /// service endpoint. Assumes the default use of RetryPolicy.DefaultExponential retry policy and default implementation of
    /// ICloudStorageEntitySerializer interface.
    /// </summary>
    /// <param name="storageAccountInfo">The access credentials for Windows Azure storage account.</param>
    /// <param name="cacheEndpointInfo">The endpoint details for Windows Azure AppFabric Caching Service.</param>
    public ReliableHybridBlobStorage(StorageAccountInfo storageAccountInfo, CachingServiceEndpointInfo cacheEndpointInfo)
        : this(storageAccountInfo, cacheEndpointInfo, new CloudStorageEntitySerializer())
    {
    }

    /// <summary>
    /// Initializes a new instance of the ReliableHybridBlobStorage class using the specified storage account information, caching
    /// service endpoint and custom implementation of ICloudStorageEntitySerializer interface. Assumes the default use of
    /// RetryPolicy.DefaultExponential retry policies when accessing storage and caching services.
    /// </summary>
    /// <param name="storageAccountInfo">The access credentials for Windows Azure storage account.</param>
    /// <param name="cacheEndpointInfo">The endpoint details for Windows Azure AppFabric Caching Service.</param>
    /// <param name="dataSerializer">The component which performs serialization and deserialization of storage objects.</param>
    public ReliableHybridBlobStorage(StorageAccountInfo storageAccountInfo, CachingServiceEndpointInfo cacheEndpointInfo, ICloudStorageEntitySerializer dataSerializer)
        : this
        (
            storageAccountInfo,
            new RetryPolicy<StorageTransientErrorDetectionStrategy>(RetryPolicy.DefaultClientRetryCount, RetryPolicy.DefaultMinBackoff, RetryPolicy.DefaultMaxBackoff, RetryPolicy.DefaultClientBackoff),
            cacheEndpointInfo,
            new RetryPolicy<CacheTransientErrorDetectionStrategy>(RetryPolicy.DefaultClientRetryCount, RetryPolicy.DefaultMinBackoff, RetryPolicy.DefaultMaxBackoff, RetryPolicy.DefaultClientBackoff),
            dataSerializer
        )
    {
    }

    /// <summary>
    /// Initializes a new instance of the ReliableHybridBlobStorage class using the specified storage account information, caching
    /// service endpoint, custom retry policies and a custom implementation of ICloudStorageEntitySerializer interface.
    /// </summary>
    /// <param name="storageAccountInfo">The access credentials for Windows Azure storage account.</param>
    /// <param name="storageRetryPolicy">The custom retry policy that will ensure reliable access to the underlying blob storage.</param>
    /// <param name="cacheEndpointInfo">The endpoint details for Windows Azure AppFabric Caching Service.</param>
    /// <param name="cacheRetryPolicy">The custom retry policy that will ensure reliable access to the Caching Service.</param>
    /// <param name="dataSerializer">The component which performs serialization and deserialization of storage objects.</param>
    public ReliableHybridBlobStorage(StorageAccountInfo storageAccountInfo, RetryPolicy storageRetryPolicy, CachingServiceEndpointInfo cacheEndpointInfo, RetryPolicy cacheRetryPolicy, ICloudStorageEntitySerializer dataSerializer)
    {
        Guard.ArgumentNotNull(storageAccountInfo, "storageAccountInfo");
        Guard.ArgumentNotNull(storageRetryPolicy, "storageRetryPolicy");
        Guard.ArgumentNotNull(cacheEndpointInfo, "cacheEndpointInfo");
        Guard.ArgumentNotNull(cacheRetryPolicy, "cacheRetryPolicy");
        Guard.ArgumentNotNull(dataSerializer, "dataSerializer");

        this.dataSerializer = dataSerializer;
        this.storageList = new List<ICloudBlobStorage>(2);

        this.storageList.Add(this.cacheStorage = new ReliableCloudCacheStorage(cacheEndpointInfo, cacheRetryPolicy, dataSerializer));
        this.storageList.Add(this.blobStorage = new ReliableCloudBlobStorage(storageAccountInfo, storageRetryPolicy, dataSerializer));
    }
}

At this point, I’m going to conclude the saga by including one more code snippet showing the implementation of the Put operation in the hybrid overflow store.

/// <summary>
/// Puts a blob into the underlying storage, overwrites the existing blob if the blob with the same name already exists.
/// </summary>
/// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
/// <param name="containerName">The target blob container name into which a blob will be stored.</param>
/// <param name="blobName">The custom name associated with the blob.</param>
/// <param name="blob">The blob's payload.</param>
/// <returns>True if the blob was successfully put into the specified container, otherwise false.</returns>
public bool Put<T>(string containerName, string blobName, T blob)
{
    return Put<T>(containerName, blobName, blob, true);
}

/// <summary>
/// Puts a blob into the underlying storage. If the blob with the same name already exists, overwrite behavior can be customized. 
/// </summary>
/// <typeparam name="T">The type of the payload associated with the blob.</typeparam>
/// <param name="containerName">The target blob container name into which a blob will be stored.</param>
/// <param name="blobName">The custom name associated with the blob.</param>
/// <param name="blob">The blob's payload.</param>
/// <param name="overwrite">The flag indicating whether or not overwriting the existing blob is permitted.</param>
/// <returns>True if the blob was successfully put into the specified container, otherwise false.</returns>
public bool Put<T>(string containerName, string blobName, T blob, bool overwrite)
{
    Guard.ArgumentNotNull(blob, "blob");

    bool success = false;
    Stream blobData = null;
    bool treatBlobAsStream = false;

    try
    {
        // Are we dealing with a stream already? If yes, just use it as is.
        if (IsStreamType(blob.GetType()))
        {
            blobData = blob as Stream;
            treatBlobAsStream = true;
        }
        else
        {
            // The specified item is not a Stream, so we serialize T into a new stream instance.
            blobData = new MemoryStream();

            this.dataSerializer.Serialize(blob, blobData);
            blobData.Seek(0, SeekOrigin.Begin);
        }

        try
        {
            // First, make an attempt to store the blob in the distributed cache.
            // Only use cache if blob size is optimal for this type of storage.
            if (CloudUtility.IsOptimalCacheItemSize(blobData.Length))
            {
                success = this.cacheStorage.Put<Stream>(containerName, blobName, blobData, overwrite);
            }
        }
        finally
        {
            if (!success)
            {
                // The cache option was unsuccessful, fail over to the blob storage as per design decision.
                success = this.blobStorage.Put<Stream>(containerName, blobName, blobData, overwrite);
            }
        }
    }
    finally
    {
        if (!treatBlobAsStream && blobData != null)
        {
            // Only dispose the blob data stream if it was newly created.
            blobData.Dispose();
        }
    }

    return success;
}
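
The Put method above delegates the size check to CloudUtility.IsOptimalCacheItemSize, which isn’t reproduced in this excerpt. A minimal sketch of what such a helper might look like, assuming the 8MB cache payload guidance mentioned earlier (the threshold constant is my assumption, not taken from the actual sample code):

public static class CloudUtility
{
    // Assumption: the Caching Service guidance cited in this post recommends items no larger than 8MB.
    private const long MaxOptimalCacheItemSizeInBytes = 8 * 1024 * 1024;

    // Returns true when the payload is small enough to be a good fit for the distributed cache.
    public static bool IsOptimalCacheItemSize(long sizeInBytes)
    {
        return sizeInBytes > 0 && sizeInBytes <= MaxOptimalCacheItemSizeInBytes;
    }
}

Anything above the threshold simply falls through to the blob storage branch in the finally block above.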

What a journey! I wish I didn’t have to write all of the code. One day, the standard APIs will be smart enough to solve challenges like this one out-of-the-box.

This post would be considered incomplete if I failed to provide some examples of how the storage abstraction layer discussed above can be consumed from a client application. I will combine these examples with a test application that will also validate the technical implementation.

Validation

In order to prove that large messages can successfully pass back and forth through the newly implemented storage abstraction layer, a very simple console application was put together. In the first step, it takes a sample XML document of 90MB in size and puts it on a Windows Azure queue. In the second step, it consumes a message from the queue; the message should indeed be the original XML document, which is written back to disk under a different name so that the file sizes can be compared. In between these steps, the application enters a pause mode during which you can explore the content of the queue and the respective message overflow store (cache or blob container). The source code for the test application is provided below.

using System;
using System.IO;
using System.Configuration;
using System.Xml.Linq;

using Contoso.Cloud.Integration.Framework;
using Contoso.Cloud.Integration.Framework.Configuration;
using Contoso.Cloud.Integration.Azure.Services.Framework.Storage;

namespace LargeQueueMessageTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Check if command line arguments were in fact supplied.
            if (null == args || args.Length == 0) return;

            // Read storage account and caching configuration sections.
            var cacheServiceSettings = ConfigurationManager.GetSection("CachingServiceConfiguration") as CachingServiceConfigurationSettings;
            var storageAccountSettings = ConfigurationManager.GetSection("StorageAccountConfiguration") as StorageAccountConfigurationSettings;

            // Retrieve cache endpoint and specific storage account definitions.
            var cacheServiceEndpoint = cacheServiceSettings.Endpoints.Get(cacheServiceSettings.DefaultEndpoint);
            var queueStorageAccount = storageAccountSettings.Accounts.Get(storageAccountSettings.DefaultQueueStorage);
            var blobStorageAccount = storageAccountSettings.Accounts.Get(storageAccountSettings.DefaultBlobStorage);

            PrintInfo("Using storage account definition: {0}", queueStorageAccount.AccountName);
            PrintInfo("Using caching service endpoint name: {0}", cacheServiceEndpoint.Name);

            string fileName = args[0], queueName = "LargeMessageQueue";
            string newFileName = String.Format("{0}_Copy{1}", Path.GetFileNameWithoutExtension(fileName), Path.GetExtension(fileName));

            long fileSize = -1, newFileSize = -1;

            try
            {
                // Load the specified file into XML DOM.
                XDocument largeXmlDoc = XDocument.Load(fileName);

                // Instantiate the large message overflow store and use it to instantiate a queue storage abstraction component.
                using (var overflowStorage = new ReliableHybridBlobStorage(blobStorageAccount, cacheServiceEndpoint))
                using (var queueStorage = new ReliableCloudQueueStorage(queueStorageAccount, overflowStorage))
                {
                    PrintInfo("\nAttempting to store a message of {0} bytes in size on a Windows Azure queue", fileSize = (new FileInfo(fileName)).Length);

                    // Enqueue the XML document. The document's size doesn't really matter any more.
                    queueStorage.Put<XDocument>(queueName, largeXmlDoc);

                    PrintSuccess("The message has been succcessfully placed into a queue.");
                    PrintWaitMsg("\nYou can now inspect the content of the {0} queue and respective blob container...", queueName);

                    // Dequeue a message from the queue which is expected to be our original XML document.
                    XDocument docFromQueue = queueStorage.Get<XDocument>(queueName);

                    // Save it under a new name.
                    docFromQueue.Save(newFileName);

                    // Delete the message. Should remove the metadata message from the queue as well as the blob holding the message data.
                    queueStorage.Delete<XDocument>(docFromQueue);

                    PrintInfo("\nThe message retrieved from the queue is {0} bytes in size.", newFileSize = (new FileInfo(newFileName)).Length);

                    // Perform a very basic file size-based comparison. In reality, we should compare the documents structurally.
                    if (fileSize > 0 && newFileSize > 0 && fileSize == newFileSize)
                    {
                        PrintSuccess("Test passed. This is expected behavior in any code written by CAT.");
                    }
                    else
                    {
                        PrintError("Test failed. This should have never happened in the code written by CAT.");
                    }
                }
            }
            catch (Exception ex)
            {
                PrintError("ERROR: {0}", ExceptionTextFormatter.Format(ex));
            }
            finally
            {
                Console.ReadLine();
            }
        }

        private static void PrintInfo(string format, params object[] parameters)
        {
            Console.ForegroundColor = ConsoleColor.White;
            Console.WriteLine(format, parameters);
            Console.ResetColor();
        }

        private static void PrintSuccess(string format, params object[] parameters)
        {
            Console.ForegroundColor = ConsoleColor.Green;
            Console.WriteLine(format, parameters);
            Console.ResetColor();
        }

        private static void PrintError(string format, params object[] parameters)
        {
            Console.ForegroundColor = ConsoleColor.Red;
            Console.WriteLine(format, parameters);
            Console.ResetColor();
        }

        private static void PrintWaitMsg(string format, params object[] parameters)
        {
            Console.ForegroundColor = ConsoleColor.Gray;
            Console.WriteLine(format, parameters);
            Console.ResetColor();
            Console.ReadLine();
        }
    }
}

For the sake of completeness, below is the application configuration file that was used during testing. If you are going to try the test application out, please make sure you modify your copy of app.config and add the actual storage account credentials and caching service endpoint information.

<?xml version="1.0"?>
<configuration>
  <configSections>
    <section name="CachingServiceConfiguration" type="Contoso.Cloud.Integration.Framework.Configuration.CachingServiceConfigurationSettings, Contoso.Cloud.Integration.Framework, Version=1.0.0.0, Culture=neutral, PublicKeyToken=23eafc3765008062"/>
    <section name="StorageAccountConfiguration" type="Contoso.Cloud.Integration.Framework.Configuration.StorageAccountConfigurationSettings, Contoso.Cloud.Integration.Framework, Version=1.0.0.0, Culture=neutral, PublicKeyToken=23eafc3765008062"/>
  </configSections>

  <CachingServiceConfiguration defaultEndpoint="YOUR-CACHE-NAMESPACE-GOES-HERE">
    <add name="YOUR-CACHE-NAMESPACE-GOES-HERE" authToken="YOUR-CACHE-SECURITYTOKEN-GOES-HERE"/>
  </CachingServiceConfiguration>

  <StorageAccountConfiguration defaultBlobStorage="My Azure Storage" defaultQueueStorage="My Azure Storage">
    <add name="My Azure Storage" accountName="YOUR-STORAGE-ACCOUNT-NAME-GOES-HERE" accountKey="YOUR-STORAGE-ACCOUNT-KEY-GOES-HERE"/>
  </StorageAccountConfiguration>
</configuration>

Provided the test application has been successfully compiled and executed, output similar to the following is expected to appear in the console window:

[screenshot]

If you peek into the storage account used by the test application, the following message will appear on the queue:

[screenshot]

Since the test message was large enough to overflow directly into the blob storage, the following screenshot depicts the expected content inside the respective blob container while the test application is paused:

[screenshot]

Note how the original 90MB XML document used in my test became an 11MB blob. This reflects roughly 87% savings on storage and bandwidth, which is the result of applying XML binary serialization. Given the target class of scenarios, XML binary serialization + compression is the first and best choice.
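
The serializer itself isn’t reproduced in this excerpt. A minimal sketch of the general technique, combining the .NET binary XML format with a Deflate compression stream, might look like the following; this is illustrative only and not the post’s actual CloudStorageEntitySerializer implementation:

using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;
using System.Xml;

// Illustrative only: not the post's actual CloudStorageEntitySerializer implementation.
public static class CompressedBinaryXmlSerializer
{
    // Serializes a data-contract-serializable object graph as .NET binary XML and compresses it with Deflate.
    public static void Serialize<T>(T instance, Stream destination)
    {
        var serializer = new DataContractSerializer(typeof(T));

        // leaveOpen: true so the caller keeps control of the destination stream.
        using (var deflate = new DeflateStream(destination, CompressionMode.Compress, true))
        using (var writer = XmlDictionaryWriter.CreateBinaryWriter(deflate))
        {
            serializer.WriteObject(writer, instance);
        }
    }
}

The matching deserializer simply reverses the chain with CompressionMode.Decompress and XmlDictionaryReader.CreateBinaryReader.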

Once the test application proceeds with deletion of the queue message, the metadata message is expected to be removed along with the blob holding the message data as shown on the screenshot below:

[screenshot]

The example shown above reflects a simplistic view of the lifecycle of a large message. It is intended to highlight the fundamentals of the storage abstraction layer, such as routing of large messages into the blob store, transparent compression, and automatic removal of both message parts. I guess it’s now the right time to jump to a conclusion.

Conclusion

As we have seen, the use of Windows Azure Queues can be extended to support messages larger than 8KB by leveraging the Windows Azure AppFabric Caching and Windows Azure Blob services without adding any additional technical restrictions on the client. In fact, I have shown that with a little extra work you can enhance the messaging experience for the client by providing quality-of-life improvements such as:

  • Transparent message compression to reduce storage costs and save bandwidth into/out of the datacenter.
  • Transparent, easily customizable overflow of large messages to Cache or Blob storage.
  • Generics support that allows you to easily store any object type.
  • Automatic handling of transient conditions for improved reliability.

As I mentioned earlier, while this solution can use both the distributed cache and the blob store for overflow storage, the use of the Windows Azure AppFabric Caching Service incurs additional costs. You should carefully evaluate the storage requirements of your project and perform a cost analysis based on the projected number of messages and message size before deciding to enable overflow using the AppFabric Cache.

While this solution provides an easy-to-use means of supporting large messages on Windows Azure queues, there is always room for improvement. Some examples of value-add features that are not incorporated in this solution and which you may wish to add are:

  • The ability to configure the type of large message overflow store in the application configuration.

  • Additional custom serializers, in case the default one does not meet your performance goals or functional needs (for instance, if you don’t need the default compression).

  • An item in the blob’s metadata acting as a breadcrumb allowing you to scan through your blob storage and quickly find out if you have any orphaned large message blobs (zombies).

  • A “garbage collector” component that will ensure timely removal of any orphaned blobs from the overflow message store (in case the queues are also accessed by components other than the storage abstraction layer implemented here). A minimal sketch of such a cleanup pass follows this list.
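
As an illustration of the last two bullets, here is a minimal sketch of a breadcrumb-driven cleanup pass written against the Windows Azure StorageClient library. The metadata key name and the age threshold are assumptions made for the example; a production implementation would also need to verify that no queue message still references a blob before deleting it:

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class OverflowStoreGarbageCollector
{
    // Hypothetical breadcrumb key; the sample code in the post does not define one.
    private const string BreadcrumbKey = "LargeMessageBreadcrumb";

    public static void RemoveOrphanedBlobs(CloudStorageAccount account, string containerName, TimeSpan maxAge)
    {
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlobContainer container = client.GetContainerReference(containerName);

        // List blobs including their metadata so the breadcrumb can be inspected.
        var options = new BlobRequestOptions
        {
            UseFlatBlobListing = true,
            BlobListingDetails = BlobListingDetails.Metadata
        };

        foreach (IListBlobItem item in container.ListBlobs(options))
        {
            var blob = item as CloudBlob;
            if (blob == null || blob.Metadata[BreadcrumbKey] == null) continue;

            // Treat sufficiently old breadcrumbed blobs as orphans. A real implementation should
            // also confirm that the corresponding queue message no longer exists before deleting.
            if (DateTime.UtcNow - blob.Properties.LastModifiedUtc > maxAge)
            {
                blob.DeleteIfExists();
            }
        }
    }
}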

The accompanying sample code is available for download from the MSDN Code Gallery. Note that all source code files are governed by the Microsoft Public License as explained in the corresponding legal notices.

Additional Resources/References

For more information on the topic discussed in this blog post, please refer to the following:

Did this blog post help you? Please give us your feedback. Tell us, on a scale of 1 (poor) to 5 (excellent), how you would rate this post and why you have given it this rating. For example:

  • Are you rating it high due to having quality code samples, self-explanatory visuals, clear writing, or another reason?
  • Are you rating it low due to poor examples, fuzzy screen shots, or unclear writing?

Your feedback will help us improve the quality of guidance we release. Thank you!

Authored by: Valery Mizonov
Reviewed by: Brad Calder, Christian Martinez, Curt Peterson, Jai Haridas, Larry Franks, Mark Simms


•• Sidharth Ghag explained a Mystery of the Windows Azure Diagnostics Performance Counter table in a 6/6/2011 post to the InfoSys blog:

Are Performance Counter diagnostics logs not getting created in your Azure development storage? Or is your Azure development fabric displaying error messages similar to the following while trying to register your specific performance counter?

PdhAddCounter(\Process(MonAgentHost#0)\ID Process) failed

The problem could lie with the pointers to your system performance counters being corrupted. Here is step-by-step guidance to investigate and resolve the issue.

Symptoms:

  1. The Azure development fabric displays the error message "PdhAddCounter(\Process(MonAgentHost#0)\ID Process) failed"
  2. The WADPerformanceCounters Azure table is not created in the Azure development storage
  3. Performance counters configured in the Windows Azure application are not being captured in the WADPerformanceCounters table

Identify cause:

Build a separate application (there are several samples readily available in the public domain), preferably outside the cloud development environment, and access the performance counters programmatically. Do you get the following exception?

"Cannot load Counter Name data because an invalid index was read from the registry

If yes, then here is what you need to do.

Steps to rectify problem:

1. Launch Command window in elevated mode

a. Click Start

b. Type CMD -> Triggers search for cmd and displays cmd.exe

c. Right click on cmd.exe

d. Click on "Run as administrator"

Command Window is launched in Administrator mode

2. Repair the performance counter pointers (those stored in the registry)

a. Type Lodctr /r     //to reset counters, will lead to pointers being enabled

Performance Counters to be rebuilt from system backup store

3. Check status of performance counters

a. Type Lodctr /q    //to check state of counters

Status of performance counters listed         

4. If any of the performance counter is disabled run the following command to enable the counter

a. Type Lodctr /e: <Name of performance counter which is disabled>     //To enable counter

<Name of performance counter> is the string between the [ ] at the beginning of each entry listed after running the command in the previous step

Performance counter is enabled

Check resolution:

Run your Windows Azure application in the development fabric with the performance counters configured as per the instructions provided here.

The WADPerformanceCounters log table should now be visible in the development storage.
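
For reference, a minimal example of wiring up a performance counter in a role’s OnStart with the Windows Azure Diagnostics API is shown below; the counter, sample rate and transfer period are illustrative values rather than recommendations:

using System;
using Microsoft.WindowsAzure.Diagnostics;

public static class RoleDiagnosticsSetup
{
    // Typically called from the WebRole/WorkerRole OnStart method.
    public static void ConfigurePerformanceCounters()
    {
        DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Capture total CPU utilization every 30 seconds (illustrative counter and sample rate).
        config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
        {
            CounterSpecifier = @"\Processor(_Total)\% Processor Time",
            SampleRate = TimeSpan.FromSeconds(30)
        });

        // Transfer the collected samples to the WADPerformanceCounters table every minute.
        config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

        DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
    }
}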

NOTE: The above problem is more likely to occur in the local development environment than on a cloud-hosted instance, primarily because of the standardized nature of the VM instances hosting the Windows Azure application. A point I would like to highlight here is that, more often than not, issues arise from the disparity in system-level configurations between the local development environment and the Azure VM instances. Although the Azure development fabric offers a cloud-like compute environment on-premises to emulate the Azure-hosted compute services, it does not offer a complete simulation of the cloud environment. The Azure emulators do not shield you from low-level, system-specific configurations of the Windows environment, which can result in certain capabilities of Azure not behaving as expected on-premises or at times even working differently once the application gets hosted on Azure. So when troubleshooting issues with your Windows Azure application, first look within the guts of your system; the problem may be closer to you than you realize.

     …

My Scott Guthrie Reports Some Windows Azure Customers Are Storing 25 to 50 Petabytes of Data post of 6/10/2011 reported:

image

Updated 6/12/2011: See updated original post here.

At 00:36:33 in the live stream archive of his Windows Azure and Cloud Computing keynote to the Norwegian Developers Conference (NDC) 2011 on 6/8/2011, Scott Guthrie (@scottgu) said:

But some of our customers we have in Azure today, for example, are doing 25 to 50 petabytes. That’s kind of their target range, of sorts, which is a lot of storage.

You aren’t kidding that’s a “lot of storage”, Scott. At Microsoft’s current storage price of US$0.15 per GB/month, that’s US$0.15 * 1,000,000 = US$150,000/PB-month, or billings of US$3.75 to 7.5 million/month.

For more details about new Windows Azure and SQL Azure features, see my New Migration Paths to the Microsoft Cloud cover story for Visual Studio Magazine’s June 2011 issue and Michael Desmond’s Windows Azure Q&A with Roger Jennings of 6/10/2011.

image

Scott also discussed SQL Azure and said at 00:42:18:

We also do autosharding as part of SQL Azure, which means that from a scale-out perspective, we can handle super-high loads, and we do all of that kind of load-balancing and scale-out work for you.

Today SQL Azure supports up to 50 GB of relational storage for a database, but you can have any number of databases. In the future, you’ll see us support hundreds of Gigabytes and Terabytes [that] you can take advantage of.

Scott jumped the gun a bit on the autosharding front. Autosharding is a component of the SQL Azure Federations program, which Cihan Biyikoglu, who’s a senior program manager for SQL Azure, described in his Federations Product Evaluation Program Now Open for Nominations! post of 5/13/2011:

Microsoft SQL Azure Federations Product Evaluation Program nomination survey is now open. To nominate an application and get access to the preview of this technology, please fill out this survey.

Let me take a second to explain the program details: the preview program is a great fit for folks who would like to get experience with the federations programmability model. As part of the preview, you get access to very early bits on scale-minimized versions of SQL Azure. This minimized version does not provide high availability or high performance and scale just yet as it runs under a different configuration. Those properties will come when we deploy to the public cluster. However you can exercise the programmability model and develop a full fledged application. There may still be minor changes to the surface but will be incremental small changes at this point. Participants will also get a chance to provide the development team with detailed feedback on the federations technology before it appears in SQL Azure. The preview program is available to only a limited set of customers. Customer who are selected for the program, will receive communication once the program is kicked off in the months of May and June 2011.

Of course, you won’t need to autoshard SQL Azure when it supports “hundreds of Gigabytes and Terabytes”.

For more details about sharding SQL Azure databases and SQL Azure Federations, see my Build Big-Data Apps in SQL Azure with Federation cover story for Visual Studio Magazine’s March 2011 issue.

Note: The NDC 2011 site provided an updated bio:

Scott Guthrie is corporate vice president of Microsoft's Azure Application Platform team, and runs the development teams responsible for delivering Microsoft’s Windows Azure, AppFabric, and Web Server Technologies and Tools. [Emphasis added.]

A founding member of the .NET project, Guthrie has played a key role in the design and development of Visual Studio and the .NET Framework since 1999. Today, Guthrie directly manages the development teams that build ASP.NET, WCF, Workflow, IIS, AppFabric, WebMatrix and the Visual Studio Tools for Web development.

Guthrie graduated with a degree in computer science from Duke University.


Chris J. T. Auld reported Apple iCloud Running on Windows Azure and Amazon S3 in a 6/10/2011 post to his Syringe.net.nz blog:

image Hi All,

Saw this rather interesting snippet come through on the Twitter feed this morning.

http://mobilitydigest.com/icloud-brought-to-you-by-microsoft-and-amazon/

Clicking through shows you some partial HTTP messages (some stuff is blurred out)

Certainly looks like iCloud is using both Azure Blob storage and S3.

They are probably using Blob storage for the ability to use Shared Access Signatures for file upload, but without seeing the full URL in the HTTP request, that’s a bit of a guess.

My guess is that the call to the iCloud servers for authorizePut will be fetching a SAS and then this is being used in the PUT request to the Blob storage endpoint.
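
For reference, issuing a write-only Shared Access Signature with the Windows Azure StorageClient library looks roughly like the sketch below; whether iCloud’s authorizePut call does anything comparable is, of course, pure speculation:

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class SasHelper
{
    // Issues a short-lived, write-only SAS URL a client can use to PUT a blob
    // without ever seeing the storage account key.
    public static string CreateUploadUrl(CloudStorageAccount account, string containerName, string blobName)
    {
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlob blob = client.GetContainerReference(containerName).GetBlobReference(blobName);

        string sas = blob.GetSharedAccessSignature(new SharedAccessPolicy
        {
            Permissions = SharedAccessPermissions.Write,
            SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(15)
        });

        // The SAS starts with '?', so it can be appended directly to the blob URI.
        return blob.Uri.AbsoluteUri + sas;
    }
}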

There is a header in there called AuthorizationBSharedKey. I certainly hope that’s not the storage account shared key for the Azure storage account! But again, without seeing the full messages I can’t really tell. It’s certainly not a standard Azure header, but it does have a somewhat worrying name.

Anyone able to pull the headers in full for some analysis? chris(at)syringe.net.nz


<Return to section navigation list> 

SQL Azure Database and Reporting

I updated My Improper Billing for SQL Azure Web Database on Cloud Essentials for Partners Account post on 6/9/2011 for what appears to be the final adjudication of the $9.99 overcharge for a free SQL Azure Web database benefit:

Update 6/9/2011 3:00 PM PDT: It appears as if I was right about the possibility of a billing snafu for Windows Azure Platform Cloud Essentials for Partners accounts:

image

Update 6/9/2011 12:45 PM PDT: The saga continues after I sent a message about my questions to Ram:

image

The account in question is not a Windows Azure Platform Introduction Special for Partners account; it’s a Windows Azure Platform Cloud Essentials for Partners account, which requires considerable effort to obtain and enables one to use the Powered By Windows Azure logo (see my live OakLeaf Systems Azure Table Services Sample Project - Paging and Batch Updates Demo project as an example). Here’s a screen capture for the subscription (at top):

image

As far as I’ve been able to determine, there is no 90-day limit on use of a free SQL Azure Web database. See the Microsoft Windows Azure Cloud Essentials Pack screen capture below.

Mea Culpa: I was wrong about having no database in the SQL Azure server. I had a single database (plus master, which is free.)

Additional question: Why can’t I get rid of the Deactivated subscriptions?

This issue indicates to me that there is a bug in the billing system for my benefit.


See the SQL Azure content of My Scott Guthrie Reports Some Windows Azure Customers Are Storing 25 to 50 Petabytes of Data post of 6/10/2011 in the Azure Blob, Drive, Table and Queue Services section above for Scott’s take on SQL Azure sharding.


<Return to section navigation list> 

MarketPlace DataMarket and OData

•• Pierre Menard posted Open Data Formats for Life Science: RDF and OData to the Pistoia Foundation blog on 6/11/2011:

Thanks to everyone who attended the open meeting of the Pistoia Alliance Technical Committee on June 6. We had representatives from about 20 organizations on the call to hear about RDF and OData and the application of these open data and query formats in life science. If you were unable to attend the webinar, we recorded the session below.

The webinar ran an hour and a half, so if you want to skip to particular parts of the discussion, here are some markers:

  • My introduction: through 10:00
  • Bob Stanley (IO Informatics) on RDF: 10:00 to 23:52
  • Eric Prud’hommeaux (W3C) on RDF: 23:54 to 45:00
  • Pablo Castro (Microsoft) on OData: 45:00 to end

I sent around a survey to everyone who participated in the webinar to gauge their interest in these technologies. If you have opinions to share on their utility or have experiences implementing them at your site, I’d love to hear from you in the comments.

The Pistoia Alliance is a global, not-for-profit, precompetitive alliance of life science companies, vendors, publishers, and academics. It appears to me to have most of the world’s large pharma manufacturers as members.


• Ralf Handl (@ralfhandl) reported SAP NetWeaver Gateway speaks OData with SAP Annotations on 6/9/2011:

[NetWeaver Gateway] URL: http://www.sdn.sap.com/irj/sdn/gateway

When reading the announcement for SAP NetWeaver Gateway or the SAP NetWeaver Gateway page in SDN, the sentence I like most is that Gateway is “leveraging REST services and OData/ATOM protocols”. Let me explain why.

Why REST?

I don’t want to get into the discussion of “SOAP versus REST”, that’s far too dangerous a battleground for me to enter lightheartedly, so I’ll just list the three main reasons why I personally like REST:

  • I can use my browser to see what data I will get.
  • If I’ve got one snippet of information, it will most likely lead me to other, related snippets of information.
  • If I know where to GET data, I know where to PUT it, and I can use the very same format.

I’ve used SOAP services for some time, and every single one of the above would have made my life easier.

Why OData?

AtomPub is the de-facto standard for treating bunches of similar information snippets (just look around to see who else is using it … or who isn’t). It’s simple, it’s extensible, and it allows putting anything into its content. Well, anything textual at least, and you can use media link entries to point to the binary stuff.

Now a lot of the textual enterprise data we deal with is structured, so we need some way to express what structure to expect in a certain kind of information snippet.

And as these snippets can come in large quantities, we need to trim them down to manageable chunks, sort them according to ad-hoc user preferences, then step through the result set page by page.

OData gives us all of these features, and some more, like feed customization that allows mapping part of the structured content into the standard Atom elements, and the ability to link data entities within an OData service (via “…related…” links) and beyond (via media link entries). So we can support a wide range of clients with different capabilities:

  • Purely Atom, just paging through data
  • Hypermedia-driven, navigating through the data web, and
  • Aware of query options, tailoring the OData services to their needs

And beyond that, OData is extensible, just as the underlying AtomPub, so we can add some features that we discovered we needed when building easy-to-use applications, both mobile and browser-based.

Why SAP Extensions?

What do we need for building user interfaces? First of all, human-readable, language-dependent labels for all properties.

Next, we need free-text search, within collections of similar entities, and across. OpenSearch to the rescue: it can use the Atom Syndication Format for its search results, so the found OData entities fit right in, and it can be integrated into AtomPub service documents via links with rel=”search”, per collection as well as on the top level. The OpenSearch description specifies the URL template to use for searching, and guess what: for collections it just points to the OData entity set, using a custom query option with the obvious name of “search”.

For apps running on mobile devices, we want to seamlessly integrate into contacts, calendar, and telephony, so we need semantic annotations to tell the client which of the OData properties contain a phone number, a part of a name or address, or something related to a calendar event.

Not all things are equal, and not all entities and entity sets will support the full spectrum of possible interactions defined by the uniform interface. Capability discovery helps clients avoid requests that the server cannot fulfill: the metadata document will tell whether an entity set is searchable, which properties may be used in filter expressions, and which properties of an entity will always be managed by the server.

Most of the applications we are focusing on now for “light-weight consumption” follow an interaction pattern called “view-inspect-act”, “alert-analyze-act”, or “explore & act”, meaning that you somehow navigate (or are led) to an entity that interests you, and then you have to choose what to do. The chosen action will eventually result in changes to this entity, or to entities related to it, but it may be tricky to express in terms of an update operation, so we advertise the available actions to the client as special Atom links – with an optional embedded simplified “form” in case the action needs parameters – and the action is triggered by POSTing to the target URI of the link.

Putting all that together we arrive at the following simplified picture:

And Gateway isn’t the only SAP product speaking OData with SAP extensions: the new Sales On Demand collaborative UI uses it internally (and it's internally available only, sorry!) to access SAP Business ByDesign. Watch this SAPPHIRE NOW keynote (starting at 61:40) to see it in action both in a browser and on mobile devices.

I’m curious to see what people will build on top of SAP NetWeaver Gateway!

Ralf Handl works at SAP AG in the Product Architecture team of the Technology & Innovation Platform. He is currently focusing on easy consumption of SAP data via REST and OData.


Turker Keskinpala (@tkes) reported a OData Service Validation Tool Update in a 6/10/2011 post to the OData Blog:

We pushed a new update to the OData Service Validation Tool. As you know we are updating the service every 2 weeks. Below is what's new in this update:

  • Added 5 new JSON rules. Including the new rules pushed 2 weeks ago, we now have a total of 16 JSON rules.
  • Fixed test result classifications for MustNot and ShouldNot requirement levels
  • Added Rule Extension Framework for code rules.
    • We have been focusing on structural rules so far. We added an extension mechanism to the rule engine to be able to add code based (semantic) rules. There are no such rules in the system yet but we are working on bringing such rules in to the rule engine and the UI.

In the meantime, we are also actively working towards open sourcing the tool. We are sorry for the lack of updates on this front but we are committed to making the tool open source and we are waiting for the legal process to be completed. We investigated our options and are currently working with the legal department to create the most appropriate contribution model so that it's also possible for the community to contribute rules.

There is no ETA for the release at the moment, but we are in the final stages of obtaining legal approvals. We will announce immediately when the source code is available.

Thank you for your continued feedback and interest. We are all ears on the OData Mailing List. Please let us know if you have any feedback, questions and/or suggestions.


Jon Udell (@judell) asserted “Facebook may not be great for event listings, but it could be a useful conduit” as a deck for his Why Facebook isn't the best home for your public events article of 6/10/2011 for the O’Reilly Radar blog:

In an earlier episode of this series I discussed how Facebook events can flow through elmcity hubs by way of Facebook's search API. Last week I added another, and more direct, method. Now you can use a Facebook iCalendar URL (the export link at the bottom of Facebook's Events page) to route your public events through an elmcity hub.

image The benefit, of course, is convenience. If you're promoting a public community event, Facebook is a great way to get the word out and keep track of who's coming. Ideally you should only have to write down the event data once. If you can enter the data in Facebook and then syndicate it elsewhere, that seems like a win.

In Syndicating Facebook events I explain how this can work. But I also suggest that your Facebook account might not be the best authoritative home for your public event data. Let's consider why not.

Here's a public event that I'm promoting:

Facebook public event

Here's how it looks in a rendering of the Keene elmcity hub:

Rendering of the Keene elmcity hub

And here's the link to the End of the world (again) event:

https://www.facebook.com/event.php?eid=207438602626457

Did you click it? If so, one of two things happened. If you were logged into Facebook you saw the event. If not you saw this:

Facebook login page

Is this a public event or not? It depends on what you mean by public. In this case the event is public within Facebook but not available on the open web. The restriction is problematic. Elmcity hubs are transparent conduits, they reveal their sources, curators do their work out in the open, and communities served by elmcity hubs can see how those hubs are constituted. Quasi-public URLs like this one aren't in the spirit of the project.

My end-of-the-world event is obviously an illustrative joke. But consider two other organizations whose events appear in that elmcity screenshot: the Gilsum Church and the City of Keene. These organizations are currently using Google Calendar to manage their public events. They use Google Calendar's widget to display events on their websites, and they route Google Calendar's iCalendar feeds through the elmcity hub.

Now that elmcity can receive iCalendar feeds from Facebook, the church and the city could use their Facebook accounts, instead of Google Calendar, to manage their public events. Should they? I think not. Public information should be really public, not just quasi-public.

What's more, organizations should strive to own and control their online identities (and associated data) to the extent they can. From that perspective, using services like Google Calendar or Hotmail Calendar are also problematic. But you have choices. While it's convenient to use the free services of Google Calendar or Hotmail Calendar, and I recommend both, I regard them as training wheels. An organization that cares about owning its identity and data, as all ultimately should, can use any standard calendar system to publish a feed to a URL served by a host that it pays and trusts, using an Internet domain name that it paid for and owns.

Either way, how could an organization manage its public event stream using standard calendar software while still tapping into Facebook's excellent social dynamics? Here's what I'd like to see:

Example Facebook login page

It's great that Facebook offers outbound iCalendar feeds. I'd also like to see it accept inbound feeds. And that should work everywhere, by the way, not just for Facebook and not just for calendar events. Consider photos. I should be able to pay a service to archive and manage my complete photo stream. If I choose to share some of those photos on Facebook and others on Flickr, both should syndicate the photos from my online archive using a standard feed protocol -- say Atom, or if richer type information is needed, OData.

The elmcity project is, above all, an invitation to explore what it means to be the authoritative source of your own data. Among other things, it means that we should expect services to be able to use our data without owning our data. And that services should be able to acquire our data not only by capturing our keystrokes, but also by syndicating from URLs that we claim as our authoritative sources.

From Jon’s http://jonudell.net/bio.html: In 2007 Udell joined Microsoft as a writer, interviewer, speaker, and experimental software developer. Currently he is building and documenting a community information hub that's based on open standards and runs in the Azure cloud.


<Return to section navigation list> 

Windows Azure AppFabric: Access Control, WIF and Service Bus

• Eve Maler (@xmlgrrl) asked Participating In Markets For Portable Identities In The Cloud: What’s The Coin Of Your Realm? in a 6/10/2011 post to her Forrester Research blog:

Many IT security pros are moving toward disruptive new authentication and authorization practices to integrate securely with cloud apps at scale. If you’re considering such a move yourself, check out my new report, The “Venn” of Federated Identity. It describes the potential cost, risk, efficiency, and agility benefits when users can travel around to different apps, reusing the same identity for login.

Aggregate sources of identities are large enough now to attract significant relying-party application “customers” – but the common currency for identity data exchange varies depending on whether the source is an enterprise representing its (current or even former) workforce, a large Web player representing millions of users, or other types of identity providers. These days, the SAML, OAuth, and OpenID technologies are the hard currencies you’ll need to use when you participate in these identity markets. You can use this report to start matching what’s out there to your business scenarios, so you can get going with confidence.

Related Forrester Research

• Azret Botash (@Ba3e64) posted OAuth Library – The Basics of Implementing a Service Provider on 6/10/2011:

Last time I showed you how to connect to Facebook, Twitter and Google using the DevExpress OAuth Library. This time, let me show you an example of the most basic OAuth 1.0 service provider. Before you dig into the implementation, however, I want to point you to a prerequisite article on http://hueniverse.com by Eran Hammer-Lahav. Eran does a very good job explaining the protocol flow.

How to implement an OAuth Service Provider

OAuth 1.0 Service Provider

Live Demo
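
For readers who want a feel for what a service provider actually has to verify, here is a minimal, library-independent sketch of the OAuth 1.0 HMAC-SHA1 signature computation. The parameter normalization is simplified relative to the spec, and the helper is purely illustrative; it is not part of the DevExpress library:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class OAuthSignature
{
    // Recomputes the OAuth 1.0 HMAC-SHA1 signature the provider compares against oauth_signature.
    // Note: Uri.EscapeDataString is close to, but not exactly, the percent-encoding required by the spec.
    public static string Compute(string httpMethod, string url,
        SortedDictionary<string, string> parameters, // all request parameters except oauth_signature
        string consumerSecret, string tokenSecret)
    {
        // Parameters must be sorted by name; SortedDictionary takes care of that (single values assumed).
        string normalizedParameters = string.Join("&", parameters
            .Select(p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value))
            .ToArray());

        string baseString = httpMethod.ToUpperInvariant() + "&" +
            Uri.EscapeDataString(url) + "&" +
            Uri.EscapeDataString(normalizedParameters);

        string key = Uri.EscapeDataString(consumerSecret) + "&" + Uri.EscapeDataString(tokenSecret ?? string.Empty);

        using (var hmac = new HMACSHA1(Encoding.ASCII.GetBytes(key)))
        {
            return Convert.ToBase64String(hmac.ComputeHash(Encoding.ASCII.GetBytes(baseString)));
        }
    }
}

The provider recomputes this value from the incoming request and compares it with the oauth_signature parameter supplied by the consumer.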


The AppFabric Team Blog posted Updated IP addresses for AppFabric Data Centers on 6/10/2011:

Today (6/10/2011) the Windows Azure AppFabric team has updated the IP ranges on which the AppFabric nodes are hosted. If your firewall restricts outbound traffic, you will need to perform the additional step of opening your outbound TCP ports and IP addresses for these nodes. Please see the 1/28/2010 post “Additional Data Centers for Windows Azure AppFabric” for the full list of IP ranges, which was updated today to include the changes shown below.

Updates made today are shown below:

Added IPs

United States (South/Central)
157.55.196.0/22, 157.55.200.0/22

United States (North/Central)
65.52.106.240/28, 65.52.106.16/28, 65.52.107.0/28, 65.52.106.224/28, 65.52.106.32/27, 65.52.106.64/27, 65.52.106.160/27, 65.52.106.192/27, 65.52.106.96/27, 65.52.106.128/27, 157.55.24.0/21, 157.55.208.0/21, 157.55.60.240/28

United States (North/West)
65.52.98.96/28, 65.52.103.128/27, 65.52.98.96/28, 65.55.19.64/26, 65.52.99.0/24, 65.52.101.0/24, 65.55.25.96/28

Europe (North)
157.55.3.0/24

Europe (West)
213.199.128.0/20, 213.199.180.112/28, 213.199.180.32/28, 213.199.180.96/28, 213.199.180.192/26, 213.199.183.0/24, 157.55.8.128/28, 157.55.8.144/28, 157.55.8.160/28, 157.55.8.64/26

Asia (Southeast)
207.46.48.0/20, 111.221.16.0/21, 111.221.80.0/20, 111.221.96.0/20

Asia (East)
207.46.72.0/26, 207.46.89.16/28, 207.46.95.32/27, 207.46.77.224/28, 207.46.87.0/24, 207.46.67.160/27, 207.46.67.192/27

Removed IPs

None.


• Michael Washam posted Understanding Windows Azure AppFabric Queues, a 00:40:16 interview with Clemens Vasters and Murali Krishnaprasad to Channel9 on 6/10/2011:

Interview with Principal Technical Lead Clemens Vasters and Principal Development Manager Murali Krishnaprasad (MK) regarding the May 2011 CTP release of Windows Azure AppFabric. We discuss new technologies such as Topics, Queues, Subscriptions and how this relates to doing async development in the cloud.

Resources:


Steve Marx (@smarx) posted Cloud Cover Episode 47 - Queues and Topics in the Windows Azure AppFabric Service Bus on 6/10/2011:

Join Wade and Steve each week as they cover the Windows Azure Platform. You can follow and interact with the show at @CloudCoverShow.

In this episode, Brent Stineman joins Steve to discuss the newly-released Topics and Queues features in the Windows Azure AppFabric Service Bus.

In the news:

Read more about Queues and Topics and get the full source code for Brent's demo app over on his blog.


Scott Densmore reported the availability of A Guide to Claims Based Identity Hands on Labs on CodePlex in a 6/9/2011 post:

We just finished our last bit of testing on these hands on labs. They are a great companion to the guide that we released last year as an RC. One of the things I am really excited about in these hands on labs is our labs on AD FS V2. In our guide we talk about using AD FS V2 as an Identity Provider / STS and now we show you how to do it. Download both the guide and the hands on labs from CodePlex and provide us feedback.

The AD FS V2 content map can be found here. It has a lot of great content about AD FS V2 that can keep you on the path.


<Return to section navigation list> 

Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

•• Haddicus (@Haddicus) claimed Azure Connect – Not as easy as they say it is in a 6/1/2011 post to his Haddicus Development blog:

image For the [last] couple months, I have been working on testing some cloud related applications to fully take advantage of cloud storage, shared sessions across different nodes, and cloud to location via a virtual network utilizing Azure Connect. Each presents their own challenges. Finding a provider that has a high level of uptime, and the ability to scale is very important. So far I have been pretty happy with the ability of Windows Azure to scale to different app sizes, as well as, have all the technologies that seem to bridge the gap for each section.

Cloud-related technologies truly can be difficult sometimes; it can be hard to understand how to handle things that are often taken for granted when the servers are inside a network and completely manageable locally: simple things like remoting into a machine, all the way to managing session state between nodes that have been scaled out.

Two other obstacles are maintaining dependencies on servers that require certain software to be installed at runtime, and ensuring greater security of important business logic and services required for a software solution. The two major ways of dealing with this type of issue are to either:

  1. Deploy custom images to the cloud that are prebuilt with the software you need to run in the cloud.
  2. Make a library available outside of the cloud, that can be called from the cloud, with software dependencies and requirements that might not be feasible in the cloud.

Each has its own positives and negatives.

First, a custom cloud image can take some time to deploy. This cloud image not only needs to be deployed, but also updated after deployment. Cloud images contain server software that is critical to keep up to date, and it can take a while, even if already imaged, to deploy such an image just due to maintenance. Uploading these deployments can also take some time, since the images can be several gigabytes in size.

Second, a local library increases the number of moving parts involved in the system, making it harder to debug some issues due to the different areas of potential breakage. One nice thing, though: exceptions thrown locally will be available for viewing in the Event Viewer, so you essentially have another tier of error reporting off the cloud (for dependencies within the said module). A big positive here is that you can maintain certain libraries and services that have software and dependency requirements that just don't make sense for the cloud. For example, say you have an order module that sends information to a backend system but requires 2-3 other dependencies that lie on your network. A cloud solution really isn't feasible, since you would have to make multiple connections, and you would be required not only to install software on a custom image and deploy it, but also to test the connection between these nodes on every instance and ensure the remote nodes can handle that many connections as the instances grow.

Keeping this in mind, sometimes it makes sense to go the second route: connect a remote library or service to your instances to provide those much-needed dependencies, reliably and responsibly. To do so, we really need a connector to ensure services can be accessed from within a network; this is where Azure Connect comes in.

Azure Connect is a program that works with Windows Azure to help bridge the gap between endpoints (your servers, on your own network) and your published software. These are the steps I took to set up Azure Connect between my local desktop and Windows Azure:

  1. Created the WCF project and published it via localhost to a directory on my local computer.
  2. Created a cloud project that accesses the local service based on its qualified name [computer].[mydomain].com (a minimal consumption sketch is shown after this list).
  3. Ensured it would properly work on localhost, as well as, the service accessible to other computers on my network.
  4. Published the cloud project and ensured that it is working fine without the connection to the endpoint being referenced.
  5. Downloaded the Endpoint client from the Azure portal and installed & confirmed I am connected.
  6. Moved my Endpoint into a group.
  7. Connected the group to the Role I want to connect to
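
As referenced in step 2, once Azure Connect is in place the web role can call the on-premises WCF service by the machine’s fully qualified name, just as it would on a LAN. Below is a minimal sketch; the service contract, binding and address are placeholders for whatever your own service exposes:

using System;
using System.ServiceModel;

// Placeholder contract; substitute your own service interface.
[ServiceContract]
public interface IOrderService
{
    [OperationContract]
    string SubmitOrder(string orderXml);
}

public static class OnPremisesServiceClient
{
    public static string Submit(string orderXml)
    {
        // The FQDN resolves over the Azure Connect virtual network, e.g. computer.mydomain.com.
        var address = new EndpointAddress("http://computer.mydomain.com/OrderService/OrderService.svc");

        var factory = new ChannelFactory<IOrderService>(new BasicHttpBinding(), address);
        IOrderService channel = factory.CreateChannel();
        try
        {
            return channel.SubmitOrder(orderXml);
        }
        finally
        {
            ((IClientChannel)channel).Close();
            factory.Close();
        }
    }
}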

The tutorial I followed, though, doesn't go into how to troubleshoot nodes, nor how to ping external entities. To enable ping, you must add a startup command that opens the firewall for ICMP traffic. To do so, create a file ‘enableping.cmd’ and insert the following into it:

Code Snippet

  1. Echo Enable ICMP
  2. netsh advfirewall firewall add rule name="ICMPv6in" dir=in action=allow enable=yes protocol=icmpv6
  3. exit /b 0

Now that you have the file, we must tell the application to run this command on startup. In the service definition of your cloud project, inside your WebRole configuration:

Code Snippet

  1. <Startup>
  2.       <Task commandLine="enableping.cmd" executionContext="elevated" taskType="simple"/>
  3. </Startup>

This will allow you to ping to the cloud. To enable pinging from the cloud to your own server, you MUST run the same command on your server. To do so, you can open a command prompt window and insert the same command:

[screenshot]

Once this is done, you should be able to ping back and forth between your server and the cloud; if not, there is something else wrong with your network setup. Ensuring that the Azure Connect icon shows you are fully connected, with no issues, on both the cloud and the server is an important step in verifying this is working.

Another issue I ran into when working with this was that port 80 on my computer was not open for traffic seeking to consume the local service I had created. Once port 80 was opened, I had no more issues connecting from the cloud to the service running in a local IIS deployment within my network. Note, this service was not made available outside of my network, so Azure Connect provides this connectivity.

Every day we hit challenges in new technologies, and the cloud environment is definitely a challenging one. There are a lot of things that can (and probably will) fail along the way. The only way to make it through is to keep going even when you hit the wall. Eventually you will get through!


<Return to section navigation list> 

Live Windows Azure Apps, APIs, Tools and Test Harnesses

•• Thomas Conté (@tomconte) posted a Java sample application for Windows Azure on 6/12/2011:

I have published on GitHub the source code for a very simple Java Web application that includes all the basic building blocks you would need to start developing a “real” Java application for Windows Azure. It takes the form of an Eclipse project that contains the following elements:

  • A “classic” JSP application (very 20th century ;-) created using the “Dynamic Web Project” template in Eclipse
  • The Windows Azure SDK for Java, including all its pre-requisites in terms of third-party libraries
  • The JDBC Driver for SQL Server and SQL Azure (version 3.0) that will allow you to connect to SQL Azure
  • The almost-latest version of Hibernate (3.6.4), because that’s what most people use to access their database
  • A simple Hibernate configuration for SQL Azure
  • A micro-application that shows how to use Blob Storage and SQL Azure

The idea is to provide Java developers with something that is “ready to use” and does not require them to gather all the components. I hope to make the sample application a little more complete over time to show additional elements, notably Table Storage.

Here is what the application looks like:

[Screenshot of the sample application]

And here is some information about the various components in the application.

First, the Hibernate configuration:

<!-- Database connection settings -->
<property name="connection.driver_class">com.microsoft.sqlserver.jdbc.SQLServerDriver</property>
<!-- Local database settings (e.g. SQL Express) -->
<property name="connection.url">jdbc:sqlserver://localhost:1433;databaseName=Chinook;</property>
<property name="connection.username">sa</property>
<property name="connection.password">Pass123!</property>
<!-- SQL Azure connection settings -->
<!--<property name="connection.url">jdbc:sqlserver://YOURSERVER.database.windows.net:1433;databaseName=Chinook</property>-->
<!--<property name="connection.username">YOURLOGIN@YOURSERVER</property>-->
<!--<property name="connection.password">YOURPASSWORD</property>-->
<!-- JDBC connection pool (use C3P0) -->
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">20</property>
<property name="hibernate.c3p0.idle_test_period">60</property>
<property name="hibernate.c3p0.max_statements">100</property>
 

I have included two possible configurations: “localhost” for local development, where you would typically use a local SQL Express database, and a sample SQL Azure configuration, with a typical URL and the login string that has to follow a slightly specific format (user@host).

I have also configured a C3P0 connection pool, with an “idle_test_period” set to 60 seconds, which is required when connecting to SQL Azure: the Windows Azure infrastructure will automatically drop any TCP connection that stays idle for more than 60 seconds!

The rest of the configuration is very simple: I have modeled three classes (Album, Artist, Track) from the Chinook sample database you can download from CodePlex.
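
The service class shown below obtains its Hibernate Session from a HibernateUtil helper that is not reproduced in this excerpt. As a rough guide, a minimal sketch of such a helper (the standard Hibernate 3.x bootstrap pattern, assuming the configuration above is saved as hibernate.cfg.xml on the classpath, and noting that getCurrentSession() also expects a current_session_context_class setting such as "thread") could look like this:

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class HibernateUtil {
    private static final SessionFactory sessionFactory = buildSessionFactory();

    private static SessionFactory buildSessionFactory() {
        try {
            // Reads hibernate.cfg.xml from the classpath and builds the factory once.
            return new Configuration().configure().buildSessionFactory();
        } catch (Throwable ex) {
            // Surface configuration problems instead of failing silently later.
            System.err.println("Initial SessionFactory creation failed: " + ex);
            throw new ExceptionInInitializerError(ex);
        }
    }

    public static SessionFactory getSessionFactory() {
        return sessionFactory;
    }
}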

I then have a very simple class showing how you can access SQL Azure, and also use the Windows Azure Blob Storage to store some pictures.

public class AlbumService {
	protected static final String BLOB_HOST_NAME = "http://tcontepub.blob.core.windows.net/";
	protected static final String BLOB_CONTAINER_NAME = "ledzep";
	
	public List<DisplayAlbum> getAlbumsForArtist(String artistName) {
		List<DisplayAlbum> displayAlbums = new ArrayList<DisplayAlbum>();
		
    	Session session = HibernateUtil.getSessionFactory().getCurrentSession();
    	
        session.beginTransaction();
		List albums = session.createQuery("from Album a where a.artist.name='" + artistName + "'").list();
        for (int i=0; i < albums.size(); i++) {
        	Album a = (Album)albums.get(i);
        	
        	String blobName = a.getTitle() + ".jpg";
        	String containerName = BLOB_CONTAINER_NAME;
        	String img = null;
        	
        	if (BlobUtil.blobExists(containerName, blobName)) {
            	img = BLOB_HOST_NAME + containerName + "/" + blobName;
        	}
        	DisplayAlbum da = new DisplayAlbum();
        	da.setTitle(a.getTitle());
        	da.setAlbumId(a.getAlbumId());
        	if (img != null) {
        		da.setCover(img);
        	}
        	displayAlbums.add(da);
        }
        session.getTransaction().commit();
        
        return displayAlbums;
	}
}

Here, I use Hibernate to issue a SQL request and find a list of albums corresponding to a given artist name. Once I have a list of albums, I issue requests to the Blob Storage to see if I can find a JPEG file bearing the same name (representing the picture of the album cover). If I find a picture, I add it to the view object I will use to display the information. This is a trivial example, but allows me to demonstrate, in the BlobUtil class, how to access Blob Storage:

public class BlobUtil {
	  //protected static final String BLOB_HOST_NAME      = "http://blob.core.windows.net/";
	  //protected static final String AZURE_ACCOUNT_NAME  = "YOURACCOUNT";
	  //protected static final String AZURE_ACCOUNT_KEY   = "YOURKEY";
	  //protected static final boolean PATH_STYLE_URIS	= false;
	  protected static final String BLOB_HOST_NAME      = "http://127.0.0.1:10000/";
	  protected static final String AZURE_ACCOUNT_NAME  = "devstoreaccount1";
	  protected static final String AZURE_ACCOUNT_KEY   = "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==";
	  protected static final boolean PATH_STYLE_URIS	= true;
	  	  
	  public static boolean blobExists(String containerName, String blobName) {
		  BlobStorageClient storageClient = BlobStorageClient.create(
				  URI.create(BLOB_HOST_NAME),
				  PATH_STYLE_URIS,
				  AZURE_ACCOUNT_NAME,
				  AZURE_ACCOUNT_KEY);
		  
		  IBlobContainer container = storageClient.getBlobContainer(containerName);
		  
		  return container.isBlobExist(blobName);
	  }
}

As you can see in this example, the API is very simple: a call to container.isBlobExist() will test for the existence of a given Blob.

The static parameters are by default configured to point to the storage emulator, running locally; the Storage Emulator is part of the Windows Azure SDK you installed on your development machine. For a live deployment to Windows Azure, you will use the other set of parameters, where you will enter your storage account and secret key. You will find both these values in the Windows Azure administration portal, when you create a Storage account.

And finally, a little JSP page will assemble everything:

<jsp:useBean id="AlbumService" class="org.azurejava.sample.AlbumService" scope="page" />
<div id="albums">
<%
	List<DisplayAlbum> albums = AlbumService.getAlbumsForArtist("Led Zeppelin");
	Iterator<DisplayAlbum> i = albums.iterator();
	while (i.hasNext()) {
		DisplayAlbum a = i.next();
%>
	<div class="album">
		<div><%= a.getCover() != null ? "<img class=\"albumcover\" src=\"" + a.getCover() + "\">" : "" %></div>
		<div><%= a.getTitle() %></div>
	</div>
<%
	}
%>
</div>

You can execute this application in a local Tomcat instance installed on your machine. You do not need to execute it within the Windows Azure Compute Emulator, because it does not use any specific Compute features (like multiple Roles). However, you will need the Storage Emulator if you want to develop storage code locally.

However, in order to deploy it into Windows Azure (either the local Emulator or the live Fabric), you will need to package it; this is where you can use the Windows Azure Starter Kit for Java, as I will show you in a future post! :-)

I will certainly evolve this Java sample, so follow it on GitHub!

Thomas is an Architect Evangelist at Microsoft France specializing in Windows Azure, Web Platform & Interoperability.


•• Steve Marx (@smarx) uploaded SmarxRole, a multi-language Windows Azure Role to Codeplex on 6/7/2011 (missed when uploaded):

Smarx Role is a Windows Azure role that supports publishing web applications written in Node.js, Ruby, and Python. Apps are published/synchronized via Git or blob storage, allowing nearly instantaneous changes to published applications. It automatically pulls in dependent modules using each language's package manager (npm, Gem, or pip).

For an overview of the Smarx Role and how it works, see smarx's talk at MIX 2011 (slides).

Important Note

This software, when executed, requires you to download and install programs from third-parties that are in no way affiliated with Microsoft. These programs are: Python 2.7.1, Ruby 1.9.2, Node.js 0.4.7, and msysgit 1.7.4. The licenses for those products are between you and the relevant third parties and are available at the projects' homepages:

Usage

The simplest way to deploy a Smarx Role is to grab one of the precreated packages from the latest release, edit ServiceConfiguration.cscfg, and deploy via the Windows Azure portal.

For the actual application bits that you deploy, the requirements are relatively simple:

  1. The application must reside either at a Git URL or in a blob container (details specified in ServiceConfiguration.cscfg).
  2. The application should be written in Python, Ruby, or Node.js, with a file called app.py, app.rb, or app.js.
  3. The application will be executed with the parameters -p #### -e production, where #### is the port the application should listen on for incoming HTTP requests.
  4. The application should specify dependent modules via pip, Gem, or npm conventions. (Smarx Role will execute pip install -r Pipfile, bundle install, and npm install --verbose for you before running your code.)

CoffeeScript note

If you want, you can write CoffeeScript instead of JavaScript when using Node.js. Smarx Role will execute the command coffee -o . src/*.coffee before executing your code, so, for example, you'll want a file called src/app.coffee if you're writing your main Node.js application in CoffeeScript. (Of course, you can simply compile your .coffee down to .js before committing/uploading it.) [See article below.]

Examples

Steve also published smapi.coffee as a GitHub Gist as an example of Node.js, CoffeeScript, and the Windows Azure Service Management API on 6/9/2011:

###
certificate generated via:
    openssl req -x509 -nodes -days 365 -subj "/CN=test" -newkey rsa:1024 -keyout priv.pem -out pub.pem
    openssl x509 -outform der -in pub.pem -out pub.cer
(pub.cer is what's uploaded to Windows Azure via the portal)
###

crypto = require 'crypto'
https = require 'https'
fs = require 'fs'
xml2js = require 'xml2js'
cli = require 'cli'

util = require 'util'

base64DecodeAll = (x) ->
    return unless x instanceof Object
    for name, value of x
        if name == 'Label'
            x[name] = (new Buffer value, 'base64').toString()
        else
            base64DecodeAll x[name]
    return

class ServiceManagementApi
    constructor: (@subscriptionId, @key, @cert) ->

    getOperatingSystemFamilies: (callback) ->
        this.call 'operatingsystemfamilies', 'GET', callback

    call: (path, method, body, callback) ->
        if not callback?
            if not body?
                callback = method
                method = 'GET'
                body = undefined
            else
                callback = body
                body = undefined
        options = {
            host: 'management.core.windows.net',
            port: 443,
            key: @key,
            cert: @cert,
            method: method,
            headers: { 'x-ms-version': '2010-10-28' },
            path: "/#{@subscriptionId}/#{path}"
        }
        req = https.request options, (res) ->
                body = ''
                res.on 'data', (chunk) -> body += chunk
                res.on 'end', (result) ->
                    parser = new xml2js.Parser
                    parser.on 'end', (result) ->
                        base64DecodeAll result
                        for name, value of result
                            callback value unless name == '@'
                        return
                    parser.parseString body
                    return
        req.write body if body?
        req.end()
        return

options = cli.parse {
    subscriptionId: ['s', 'Subscription ID (found in the Windows Azure portal)', 'string'],
    privateKey: ['p', 'Private key (.pem file)', 'file', 'priv.pem'],
    certificate: ['c', 'Public certificate (.pem file)', 'file', 'pub.pem']
}
new ServiceManagementApi options.subscriptionId,
    fs.readFileSync(options.privateKey, 'ascii'),
    fs.readFileSync(options.certificate, 'ascii')
.getOperatingSystemFamilies (result) ->
    console.log util.inspect result, false, 10

•• Jeremy Ashkenas (@jashkenas) offers detailed information about CoffeeScript here:

CoffeeScript is a little language that compiles into JavaScript. Underneath all of those embarrassing braces and semicolons, JavaScript has always had a gorgeous object model at its heart. CoffeeScript is an attempt to expose the good parts of JavaScript in a simple way.

The golden rule of CoffeeScript is: "It's just JavaScript". The code compiles one-to-one into the equivalent JS, and there is no interpretation at runtime. You can use any existing JavaScript library seamlessly (and vice-versa). The compiled output is readable and pretty-printed, passes through JavaScript Lint without warnings, will work in every JavaScript implementation, and tends to run as fast or faster than the equivalent handwritten JavaScript.

Latest Version: 1.1.1


Michael Desmond (@MichaelDesmond, pictured below) asked and I answered Windows Azure Q&A with Roger Jennings (@rogerjenn) in a 6/10/2011 post to Visual Studio Magazine’s Desmond File blog:

Roger Jennings wrote the June issue cover story for Visual Studio Magazine, titled "New Migration Paths to the Microsoft Cloud." We caught up with Roger earlier this week to get an update on Windows Azure developments and how Microsoft's efforts on products like LightSwitch and Windows 8 dovetail with the company's cloud strategy.

Michael Desmond: Scott Guthrie transitions into his new role this month. What do you think his Azure App Platform team will be up to and what technology improvements and services in your view should be top priorities? Have you seen any change in the Azure group yet?

Roger Jennings: So far, Scott’s been fulfilling prior commitments with a grand tour of London for an all-day "Gu-athon" presentation with a cloud development session on June 6, and then in Norway for a Norwegian Developers Conference to give the "Cloud Computing and Windows Azure" keynote on June 8. He's in Germany today for an IT&DevConnections Germany keynote ("Microsoft’s Web Platform").

In Oslo, Scott said Microsoft will be adding four more data centers to the current six in the next few months. He also mentioned that "some of the Azure customers we have today are storing 25 to 50 petabytes."

I recommend that Scott review the Windows Azure Feature Voting Forum for new feature requests. My favorite unfulfilled requests are full-text search and secondary indexes for table storage, as well as Transparent Data Encryption and full-text search for SQL Azure.

Scott said in Oslo that he’d been "working on Azure for the last two weeks," so he hasn’t had much time to leave his imprint on the Windows Azure Team.

MD: According to announcements at Tech-Ed North America last month, the June Windows Azure AppFabric CTP will offer developers a first look at Microsoft’s AppFabric Composition Model and related dev tools for managing multi-tier apps. What do you think of Microsoft’s approach and how is this problem solved by other cloud service providers?

RJ: Microsoft announced at PDC 2010 the AppFabric Composition Model and Visual Tools, as well as an App Fabric Container scheduled for a CTP in the first half of 2011. The AppFabric team reported at Tech-Ed 2011 that the June 2011 AppFabric CTP will include AppFabric Developer Tools to let you "visually design and build end-to-end applications on the Windows Azure platform," AppFabric Application Manager for runtime capabilities that enable "automatic deployment, management and monitoring of the end-to-end application" and provide "analytics from within the cloud management portal."

The Composition Model will be a "set of .NET Framework extensions for composing applications on the Windows Azure platform.... The Composition Model gets created by the AppFabric Developer Tools and used by the AppFabric Application Manager."

The June 2011 CTP isn’t available yet, but I expect it to provide a more advanced version of Amazon Web Services’ Elastic Beanstalk feature, as well as application management (DevOps) features offered by VMware and third-party cloud administration providers.

MD: Still no sign of the Windows Azure Platform Appliance at Tech-Ed. Any idea what’s happening with it?

RJ: News about the first commercial implementation of the Windows Azure Platform Appliance, which I call WAPA, finally emerged on June 7 in a joint Fujitsu/Microsoft press release. Fujitsu’s Global Cloud Platform service ("FGCP/A5") has been running WAPA on a trial basis in a Japanese data center since April 11. The service is scheduled for general release in August at ¥5 (US$0.0623) per hour for an Extra Small instance, not much more than Microsoft’s US$0.05 per hour charge.

Fujitsu says it expects to have "400 enterprise companies, 5,000 SMEs and ISVs [as customers] in a five-year period after the service launch." I was surprised to see a published sales target, which is uncommon for US tech firms. eBay has issued periodic details of its planned WAPA installation, but HP and Dell Computer, the other two partners Microsoft announced at last year’s Worldwide Partner Conference, have yet to announce their WAPA implementation plans.

MD: People also expected to see Visual Studio LightSwitch v1 after Beta 2 with a Go Live license was released in March. Have you heard anything about the final release?

RJ: Automated deployment of LightSwitch Beta 2 projects to Windows Azure was the most significant new feature for me. I’m a longtime Microsoft Access developer/writer, so I appreciate LightSwitch’s similar rapid application development (RAD) features with SQL Server and SQL Azure as back ends. As far as I know, Microsoft’s still mum (as usual) on LightSwitch’s RTM/RTW date. Beth Massi’s MSDN blog keeps you up to date on LightSwitch developments.

MD: What’s your take on the announcements at Tech-Ed and their significance to Windows Azure and SQL Azure developers?

RJ: New Windows Azure AppFabric features received most of the attention at Tech-Ed, probably because AppFabric is a primary distinguishing element of Microsoft’s PaaS (Platform as a Service). AppFabric’s May CTP is mostly about the Service Bus’s extension to support messaging with Queues, which replace the former Durable Message Buffers, and Topics, and are similar to Azure Data Services’ Queues.

Service Bus Queues provide dead-letter queues and message deferral. Topics deliver new pub/sub capabilities to Queues. Load balancing and traffic optimization for relay have been dropped temporarily in this CTP, but are expected to reappear later. The AppFabric Team and AppFabricCAT (Customer Advisory Team) blogs provide detailed explanations of these new features. The May CTP also includes an updated AppFabric Access Control Services (ACS) v2, which uses OpenID to integrate with Yahoo! and Google via the Azure Developer Portal, as well as other OpenID providers via management APIs. Vittorio Bertocci’s (@vibronet) MSDN blog is the best source of ACS v2 details.

MD: Windows 8 will support HTML and JavaScript as a first-class development target in the new OS, while support for Silverlight-based "immersive" apps via the Jupiter UI library is rumored. Any thoughts on how these Web-savvy platforms dovetail with Microsoft's cloud/Azure efforts?

RJ: You can be sure that Windows 8 will have many built-in "cloudy" features, but [some] will be provided by Windows Live Skydrive, not Windows Azure directly. Concentration on HTML+CSS+JavaScript development will result in a trend toward more use of OData and Windows Azure’s REST APIs than the currently popular .NET wrappers. Microsoft’s heavy investment in LightSwitch, a Silverlight-based RAD platform, is a strong indicator of continued Silverlight support in the Windows 8 era.


The Windows Azure Team posted Real World Windows Azure: Interview with Husam Laswi, IT Director of Factory Operations at Flextronics on 6/10/2011:

As part of the Real World Windows Azure series, we talked to Husam Laswi, IT Director of Factory Operations at Flextronics, about using the Windows Azure platform to deliver the Authorized Service Center (ASC) application to retail stores. Here’s what he had to say:

MSDN: What does Flextronics do?

Laswi: Flextronics is in the business of contract manufacturing. We tailor design, manufacturing, and services for electronics OEMs in market segments, including computing, medical, consumer mobile, power supply, automotive, and more.

MSDN: Tell us about the ASC application.

Laswi: Originally developed in 2008, ASC is a tool that Flextronics developed by using Microsoft ASP.NET. Retailers can use ASC at their retail repair shops to process repair service requests. For example, employees can use it to review the customer’s warranty, search for parts in inventory, prepare quotes, upload photos of products, and track the status of repairs.

MSDN: What was the biggest challenge Flextronics faced prior to implementing ASC on the Windows Azure platform?

Laswi: We wanted the solution to be scalable because we didn’t know how fast our customer would adopt the application at its retail locations. We didn’t have enough data centers to cover the many regions in which our customer conducts business. If you’re a retail store in Asia, you don’t want to worry about logging onto a data center in the United States. Plus, our timeline was limited. We knew that if we hosted the application in our own data centers, we would have had to deal with requisitions for capital expenditures.

Retailers can use ASC at their retail shops to process repair requests.

MSDN: Did you look at other cloud computing solutions?

Laswi: We evaluated cloud solutions including Salesforce.com and Amazon Elastic Compute Cloud (EC2). If we had used Salesforce.com, we would have needed to spend time and effort redeveloping the solution. By using Windows Azure, we could leverage existing ASC technology. We also determined that using Amazon EC2 would be like having another server that just happens to be in the cloud, and we wanted greater interoperability with our traditional development paradigm.

MSDN: Describe the solution you built with the Windows Azure platform.

Laswi: Our development partner, CloudXtension, a member of the Microsoft Partner Network, did all the development work to migrate ASC to the Windows Azure platform. It transformed a static ASP.NET application into a dynamic client-side application with a rich user interface. CloudXtension also developed a workaround to enable database object creation and reporting by using Microsoft SQL Azure. The application also uses Language Integrated Query (LINQ), a set of extensions to the Microsoft .NET Framework, to retrieve uploaded product photos, which ASC stores by using server-side Blob storage. ASC also uses Access Control, a part of Windows Azure AppFabric, to get access to a proprietary web application that provides estimated payment information.

MSDN: How long did it take to deploy to the Windows Azure platform?

Laswi: Starting in April 2010, CloudXtension conducted a proof-of-concept pilot program that featured the ASC application with basic functionality—checking in service orders—running on the Windows Azure platform. CloudXtension spent three months reworking ASC, followed by three months developing feature enhancements and conducting user acceptance tests. ASC went into production at one of our customer’s retail stores on September 15, 2010.

MSDN: What benefits have you seen since implementing the Windows Azure platform?

Laswi: We got ASC to market quickly, saved costs on capital expenditures and maintenance, and used familiar and reliable Microsoft technologies. Our customer is extremely happy with the Windows Azure implementation of ASC. We have been able to sustain our business and enable our customer to grow its business.

Read the full story at:  www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000009957

To read more Windows Azure customer success stories, visit: www.windowsazure.com/evidence


Tony Bailey (a.k.a tbtechnet) reported a new Purchase 1 SQL Azure Core Offer and Receive $150 Rebate benefit in a 6/10/2011 post to TechNet’s Windows Azure Platform, Web Hosting, and Web Services blog:

There is a new SQL Azure rebate offer in market.

For those developers getting ready to start work on Azure, it seems like a good deal to lock in at lower rates and get up to $150 cash back*.

Highlights of the offer include:

  • Purchase 1 SQL Azure Core offer and receive a $150 rebate. That is essentially one third off a six-month commitment
  • Until June 30th
  • Lock in the lower rate now for up to 12 months

*Terms & Conditions.


Benzinga (@Benzinga) reported Exclusive - Microsoft Recognizes Innovation Excellence at Careerify on 6/10/2011:

Toronto, ON (PRWEB) June 10, 2011

Microsoft Canada announced today that Careerify, a leader in social media talent acquisition and employee engagement solutions, is the winner of the 2011 Blue Sky ISV Innovation Excellence Awards. The award recognizes Careerify for its market-changing approach to solving significant business problems, as well as for its innovative use of leading technologies. The awards are presented annually by Microsoft Canada Inc.

"We are very excited to see Careerify acknowledged for its innovative social recruitment and employee engagement solution that helps improve corporate branding and lower recruitment costs," said Harpaul Sambhi, Careerify's CEO and author of Social HR. "Careerify's on-demand solutions for recruitment and increased employment brand ultimately empower and engage employees throughout an organization to help improve its overall performance. Making this vision a reality required significant technological innovation."

Careerify's innovative use of Microsoft-based software has helped the company build applications with a combination of functionality, usability and scalability, positioning it to become, in effect, the Groupon of internal recruitment for organizations.

"Microsoft Canada is pleased to name Careerify as our 2011 Blue Sky Award winner," said Gladstone Grant, vice president, Developer and Platform Group, Microsoft Canada. "In a nation of gifted developers, Careerify's cloud-based social recruiting platform, built on the Windows Azure platform, has raised the bar of excellence displaying the creativity, innovation and forward thinking that is core to the values of the Awards."

The Blue Sky Award will enable Careerify to benefit from a customized engagement plan from Microsoft, including access to Microsoft resources in software and business development, to help drive engagement and increase market visibility.

About Careerify Corporation

Careerify is a social talent acquisition and employee engagement platform that great organizations use to build their recruitment strategy on. The platform allows organizations to connect with their employees' vast social networks, creating a great opportunity to reach millions of candidates and find the best talent through the people you trust most: your employees. This results in less administration, improved tracking, and greater visibility for your employment brand, all while saving you money. For more information, please visit http://www.careerify.net.

About The Blue Sky ISV Innovation Excellence Awards

Established in 2008, The Blue Sky ISV Innovation Excellence Awards reward Canadian developers for creative thinking and innovation. The competition offers a platform for showcasing the great talents of Independent Software Vendors from coast to coast to an audience of Microsoft experts and other industry voices. The winning entrant will gain national recognition along with software development support, business development resources and increased visibility within Microsoft. For more information, please visit http://www.microsoft.com/canada/bluesky/.



Read more: http://www.benzinga.com/press-releases/11/06/p1155924/exclusive-microsoft-recognizes-innovation-excellence-at-careerify#ixzz1OuEeE4Vc

Tony Bailey (a.k.a tbtechnet) posted a Windows Azure Treasure Map to TechNet’s Windows Azure Platform, Web Hosting and Web Services blog on 6/10/2011:

I’ve been searching the numerous resources on Windows Azure to try to help application developers save a bit of time.
My colleagues Frank and Melanie pointed me to the links below.

Windows Azure 30-day pass no credit card required. Promo code = TBBLIF

Some of these links are for Microsoft Partners; by the way, pretty much anyone can become a Microsoft partner by joining here.

GENERAL

Azure.com
White Papers

DEVELOPER RESOURCES

Developer Tools
MSDN Windows Azure
MSDN SQL Azure
MSDN Windows Azure AppFabric
Channel 9 – Azure Videos
Windows Azure MSDN Forums
Windows Azure Training Kit
AppFabric LABS Portal
Windows Azure Toolkit for Windows Phone 7

MICROSOFT LEARNING PLANS FOR PARTNERS

Windows Azure Compete Training
Windows Azure Platform Sales Training
Windows Azure Technical Training
Windows Azure Technical Training for Developers

PARTNER TRAINING

Windows Azure Platform Technology Overview
Building a Windows Azure Platform Practice
Identify and Qualify Windows Azure Platform ‘Quick Win’ Opportunities
Selling the Windows Azure Platform - PDC Features
Understanding Windows Azure Platform Competition
Understanding Windows Azure Platform Security Features for Business Decision Makers
Sales Qualification Questionnaire
Windows Azure System Integrators Partner Execution Playbook

ASSESSMENT RESOURCES

Microsoft Assessment & Planning (MAP) Toolkit
Video: Microsoft Assessment & Planning (MAP) Toolkit Demo

ASSESSMENT RESOURCES FOR PARTNERS
Microsoft Assessment Tool (MAT) for Migrating Applications to Windows Azure
Video: Assessing an Enterprise Application's Migration to Windows Azure, and how to use the Migration Assessment Tool


<Return to section navigation list> 

Visual Studio LightSwitch and Entity Framework 4.1+

•• Julie Lerman (@julielerman) described MVC3.1 Scaffolding Magic with Database (or Model) First, Not Just Code First in a 6/12/2011 post:

The MVC3.1 scaffolding that was released at MIX can auto-magically create an EF 4.1 DbContext class, the basic CRUD code in your controller and the relevant views all in one fell swoop. (Don’t forget the additional scaffolding tools that will build things more intelligently, i.e. with a Repository: http://blog.stevensanderson.com/2011/01/13/scaffold-your-aspnet-mvc-3-project-with-the-mvcscaffolding-package/.)

All of the demos of this, including my own [MVC 3 and EF 4.1 Code First: Here are my classes, now you do the rest, kthxbai], demonstrate the new scaffolding using Code First. In other words, just provide some classes and the scaffold will do the rest, including building the context class.

I saw a note on Twitter from someone asking about using this feature with an EDMX file instead of going the Code First way. You can absolutely do that. Here’s a simple demo of how that works, using the in-the-box template in the MVC 3.1 toolkit -- though admittedly, for my own purposes, I’m more likely to start with the template that creates a repository.

1) Start with a class library project where I’ve created a database first EDMX file.


2) Use the designer’s “Add Code Generation Item” and select the DbContext T4 template included with Entity Framework 4.1. That will add two templates to the project. The first creates a DbContext and the second creates a set of very simple classes – one for each of the entities in your model.


3) Add an ASP.NET MVC3 project to the solution.


4) Add a reference to the project with the model.


5) BUILD THE SOLUTION! This way the Controller wizard will find the classes in the other project.

6) Add a new controller to the project. Use the new scaffolding template (Controller with read/write actions and views using EF) and choose the generated class you want the controller for AND the generated context from the other project.


That will create the controller and the views. The controller has all of the super-basic data access code for read/write/update/delete of the class you selected. It’s a start. :)


Almost ready to run, but first a bit of housekeeping.

7) Copy the connection string from the model project’s app.config into the mvc project’s web.config.


8) The most common step to forget! Modify the global.asax which will look for the Home controller by default. Change it so that the routing looks for your new controller.


9) Time to test out the app. Here’s the home page. I did some editing also. It all just works.


I highly recommend checking out the alternate scaffolding templates available in the MVC3 scaffolding package linked to above.


•• The Microsoft Talent Network posted a Software Development Engineer in Test II job for the next version of Visual Studio LightSwitch. From the job description:

  • Date: Jun 1, 2011
  • Location: Fargo, ND, US
  • Job Category: Software Engineering: Test
  • Job ID: 751998-43346
  • Division: Server & Tools Business

The Visual Studio LightSwitch team is about to begin work on our next release targeting professional line-of-business (LOB) application developers building solutions for small and medium businesses. Our mission is to dramatically simplify the way modern LOB applications are built and deployed. We want to enable developers to focus on the unique needs and domain logic of the application and not the common scaffolding so that they write the code only they could write.

We are shipping our v1 release soon which leverages many technologies required to build modern LOB apps: Silverlight, Azure, Office, Entity Framework, WCF RIA Services, ASP.NET Authentication, and more. For our next release we are looking at adding new scenarios for OData and Windows next while continuing to expand existing scenarios based on customer feedback. This position will require you to be hands on with a wide array of technologies key to Microsoft’s long term success. …


• Michael Washington (@ADefWebserver) asserted I Want To Make Cars Not Car Engines in a 6/11/2011 post:

A few years ago, a vendor for a million dollar project failed to deliver, and I was required to create a program to take attendance for 20,000 children a day, and I only had 10 days to complete it. This allowed a public agency to stay in compliance, and retain the funding to deliver services to thousands of needy families. I was only able to do this because I used DotNetNuke.

If I had LightSwitch, I could have completed it in a day (and I still would have put the LightSwitch application “inside” DotNetNuke).

In the LightSwitch forums, there are some who are disappointed in what they perceive as limitations and “incompleteness”. From what I can tell, their issues come down to this:

  • For the non-programmers, all they can really do is make tables and screens. Anything else and they need to learn Linq and that is not considered easy
  • Important things like Reporting require $$ and that may be a deal breaker for a lot of people
  • The LightSwitch team can do more with LightSwitch to make it more productive for the non-programmer

We could all agree that this is only .v1 and express hope for the future versions of LightSwitch and call it a day :)

My points:

  • The amount of Linq (and other code) is so small, a properly motivated non-programmer can create the code. People like myself are very motivated and ready to guide millions of people in using LightSwitch. Microsoft Excel formulas are very complex, yet millions of non mathematicians are able to construct them.
  • Yes, $$ is required for additional functionality... and people will pay. People pay $300 for a "smart phone" because it gives them something they want. Vendors will sell a ton of plug-ins that people want. I saw this with DotNetNuke: http://Snowcovered.com sells a ton of "stuff" because people want this "stuff".
  • In .v1 we have Silverlight Custom Controls. This is the "back door" that has so far allowed me to create anything I desire in LightSwitch. It's as if the pre-fab housing kit said "oh yeah go ahead and use your own wood and materials in place of any of our walls you don't feel fit your needs".

I think we can agree on the limitations in LightSwitch, and the "official party line" is that "LightSwitch is intended for basic forms over data applications". But I keep insisting, and have a website devoted to the point, that they "opened huge back doors that you can drive an enterprise application through".

At its core LightSwitch is EF / WCF RIA / MEF / Silverlight. I have coded this stuff by hand for years and trust me, even when you mine the iron ore yourself to create a car engine, in the end you still have just a car engine. The fact that you now get your car engine from a factory does not mean it is not as good or better than the ones you created yourself.

And that is my point: I have created dozens of tutorials over many years. I am tired of making car engines; I want to make cars. People drive cars, not car engines.

At this stage of my career I care about the people. Technology for technology's sake no longer holds my interest like it once did. At this stage of my career I want to help people.

LightSwitch allows me to sit down, after my normal day job, and complete an application before it’s time for dinner.

More ...


Beth Massi (@bethmassi) reported Visual Studio LightSwitch on dnrTV! in a 6/10/2011 post:

Looks like I missed this! Carl released a dnrTV on Monday that we did together while I was in Montreal for DevTeach last week. Check it out:

Beth Massi on Advanced Visual Studio LightSwitch Beta 2

Here I walk through the array of customizations you can do with LightSwitch from writing LINQ queries, to adding custom code to the client and server projects, to using and then building your own extensions. You can download the sample application here: Contoso Construction - LightSwitch Advanced Development Sample

It’s always a fun time with Carl Franklin. One bummer was that I didn’t have an internet connection where we recorded the show, but Carl’s humor made up for it. ;-)


<Return to section navigation list> 

Windows Azure Infrastructure and DevOps

Czaroma Roman asked Considering Cloud Computing with Microsoft Azure? in a 6/10/2011 post to the CloudTimes blog:

Microsoft plays a vital role in IT and it is now heavily involved in cloud computing. Most users find its cloud services to be influential, whereas some others are wrestling with the implications of cloud computing.

For some enterprises, cloud computing is a way to move away from the traditional computer and software vendors. They realize that the value of the cloud depends on how cloud services integrate with their own IT commitments and investments.

Microsoft’s Azure is the latest development platform that allows developers to create applications for cloud computing. Azure is similar to Windows Server in the cloud. By extending the basic Microsoft SOA principles into cloud computing, it may provide the best cloud option available. It enables users to build, host and scale applications in Microsoft data centers.

Its value proposition revolves around the concept that users must build their enterprise IT infrastructure for peak load and highly reliable operation. When normal data center elements don’t have the capacity to back up application resources, Microsoft’s Azure solution can achieve the necessary levels of availability. This solution fills enterprises’ processing needs.

Microsoft Azure implements service-oriented architecture (SOA) concepts, including workflow management (the Azure Service Bus). This means that, unlike most cloud architectures, an Azure solution needs to be based on elastic workload sharing between the enterprise and the cloud.

The Azure Services Platform encompasses all of the cloud services that Microsoft is offering.

Components of Microsoft Azure

Multiple sub-platforms that Microsoft calls “Roles” are within Azure. These “Roles” are responsible for executing the code and providing an implementation of the Azure APIs that you may be referencing. They consist of the Web Role, the Worker Role and the Virtual Machine Role.

The Web Role provides Internet access to Azure applications, allowing Azure apps to function as online services. The Worker Role is a Windows Server executable task that can make outbound connections as needed such as linking an Azure cloud application back to the enterprise data center via Azure Connect. The Virtual Machine Role serves as a host for any Windows Server applications that aren’t Azure-structured.

Microsoft’s vision of developing applications that are Azure-aware and take advantage of the most fully architected hybrid cloud IT architecture available from a major supplier is clearly demonstrated. Most cloud services create hybrids in order to join resources that are not linked. Azure, on the other hand, creates a linkable IT architecture. The Azure Platform Appliance, which makes data centers operate like Microsoft’s, can be used by cloud users. This offers a private cloud and virtualization architecture, including a manageable way of improving server utilization and application availability.

There’s a good chance that your entire Windows Server operation can migrate to a Virtual Machine Role within Azure if your data center is based substantially on Windows Server. Other considerations are the integration of Windows Server applications among themselves and keeping your Windows Server licenses current.

Is Azure cloud platform right for you?

There are ways to determine if Azure can benefit you. Here are three important points to consider:

First, check how much of your IT application base is Virtual Machine Role-compatible. An Azure migration would be unfavorable if very few or none of your applications are compatible, or if you’re using Linux and other operating systems. You will also find Azure unattractive if you have already made a major commitment to virtualization within your data center: Azure improves in-house efficiency, and your in-house virtualization may have already delivered the benefit that Azure provides. Also, virtualization is closely linked with Linux.

Next, ask how much of your application base is either self-developed or based on Azure-compliant software. Azure has an incomparable ability to transfer work between the enterprise “Azure” domain and the cloud, and the more you can use this ability, the better Azure will serve your needs. Conversely, a large number of Windows Server applications that can’t be made Azure-compliant won’t do much to justify an Azure migration.

Finally, answer this question – how committed to SOA is your current operation?  Azure’s AppFabric is a SOA framework, with the Service Bus corresponding to a SOA Enterprise Service Bus (ESB) extended online.

Having no SOA implementation or knowledge can make it difficult to adopt Azure. By acquiring Microsoft-compatible SOA/ESB software products or components, you can make sure you are moving towards Azure-specific applications and achieving Azure’s comprehensive benefits.

Azure is a revolutionary platform that will become more and more significant over time.


The Windows Azure Team reported a Content Update: New Windows Azure Service Management API Content Now Available in a 6/10/2011 post:

The updated Windows Azure Service Management API content is now available. Refer to this content to learn how to programmatically create, update, or delete storage services using the following new methods: Create Storage Account, Update Storage Account, and Delete Storage Account. In addition, new versions of two existing Windows Azure Service Management API methods enable customers to obtain additional information about their deployments and subscriptions.

  • The new version of the Get Deployment method returns the following additional information:
    • Instance size, SDK version, input endpoint list, role name, VIP, port
    • Update domain and fault domain of role instance
  • The new version of the List Subscriptions method returns the following additional information: OperationStartedTime and OperationCompletedTime
  • The request header to use the new versions of these methods is: “x-ms-version: 2011-06-01”

NOTE:  This is an update to the post, “New Windows Azure Service Management API Features Ease Management of Storage Services”.
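
To see the new header in action, here is a minimal Java sketch (in the spirit of the Java samples above) of calling Get Deployment for a hosted service's production slot. The subscription ID, service name, and keystore values are placeholders, and the keystore is assumed to hold your management certificate; treat this as an illustration rather than official sample code.

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;

public class GetDeploymentSample {
    // Placeholder values -- substitute your own.
    static final String SUBSCRIPTION_ID = "YOUR-SUBSCRIPTION-ID";
    static final String SERVICE_NAME    = "YOUR-HOSTED-SERVICE";
    static final String KEYSTORE_PATH   = "azure.jks";   // assumed keystore holding the management certificate
    static final String KEYSTORE_PASS   = "password";

    public static void main(String[] args) throws Exception {
        // Load the client (management) certificate from a Java keystore.
        KeyStore ks = KeyStore.getInstance("JKS");
        ks.load(new FileInputStream(KEYSTORE_PATH), KEYSTORE_PASS.toCharArray());
        KeyManagerFactory kmf = KeyManagerFactory.getInstance("SunX509");
        kmf.init(ks, KEYSTORE_PASS.toCharArray());
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(kmf.getKeyManagers(), null, null);

        // Get Deployment for the production slot.
        URL url = new URL("https://management.core.windows.net/" + SUBSCRIPTION_ID
                + "/services/hostedservices/" + SERVICE_NAME + "/deploymentslots/production");
        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
        conn.setSSLSocketFactory(ctx.getSocketFactory());
        conn.setRequestMethod("GET");
        // Request the new API version to get the additional deployment details.
        conn.setRequestProperty("x-ms-version", "2011-06-01");

        // Dump the XML response, which now includes instance size, SDK version,
        // input endpoints, and update/fault domain information.
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        for (String line; (line = in.readLine()) != null; ) {
            System.out.println(line);
        }
        in.close();
    }
}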


Lori MacVittie (@lmacvittie) asserted We had a successful #IPv6 day – but not everyone was so fortunate. But that’s why you test, isn’t it? as an introduction to her F5 Friday: IPv6 Day Redux post to F5’s DevCentral blog:

Application delivery controllers, i.e., load balancers, were an integral component in getting ready for World IPv6 Day. As we, as in the Internets, continue to plan a path toward full IPv6 support, they will continue to be a key piece of the migration puzzle regardless of the strategy (dual-stack, translation, tunnels) ultimately chosen.

At F5 we chose to eat our own dogfood and implement what is essentially a “dual-stack” strategy as a means to support IPv6 and thus participate in the testing. Our IT department leveraged BIG-IP LTM and GTM (for DNS) capabilities to ultimately get our IPv6 presence up and running. The details can be found in a case study, published on IPv6 day:


Customers can take advantage of BIG-IP devices’ dual stack capabilities and use their own DNS server to resolve IPv6 requests directly. The IT group used this method for its proof of concept. It configured the BIG-IP 3900 devices for IPv6, assigning them IPv6 virtual addresses that point to www.f5.com web servers—the same physical servers that already host www.f5.com on the IPv4 Internet.

No changes were made to the servers themselves. “We simply added new IPv6 addresses for most of our web properties to our DNS server in BIG-IP GTM and made those addresses publicly available on the IPv6 Internet,” says [Casey] Scott [Network Engineer in the IT group at F5]. Connected to both IPv4 and IPv6 networks, BIG-IP GTM now has valid A records (IPv4) and AAAA records (IPv6) for all F5 web properties, so it can answer DNS queries in either direction—client to server or server to client. For example, when BIG-IP GTM receives an A record request, it hands back an IPv4 address to the client; when it receives a quad-A (AAAA) record request, it hands back an IPv6 address to the client.

“We tested this method using several different IPv6 clients,” says Scott, “and it worked beautifully.” With the exception of links to third-party content available only on the IPv4 network, in all cases, F5 web properties worked as expected using IPv6- only client devices.

-- F5 Enables IPv6 Network Support in Record Time Using Existing F5 Tools, Technologies
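
If you want to see the A/AAAA behavior for yourself from the client side, here is a minimal Java sketch that lists the IPv4 and IPv6 addresses your resolver returns for a host. www.f5.com is simply the example host, and the results also depend on your local OS and resolver IPv6 configuration, not just on the target site's DNS records.

import java.net.Inet6Address;
import java.net.InetAddress;

public class DualStackCheck {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "www.f5.com";
        // getAllByName returns every address the resolver knows about:
        // Inet4Address instances come from A records, Inet6Address from AAAA records.
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            String family = (addr instanceof Inet6Address) ? "AAAA (IPv6)" : "A (IPv4)";
            System.out.println(family + ": " + addr.getHostAddress());
        }
    }
}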

Live testing went well, from our perspective – without any of the hiccups. We processed a fair amount of IPv6 traffic to our enabled site, which showed good participation from others as well as clients helping out by poking around. There also seems to be a fair amount of IPv6 activity in general, as our IPv6-enabled presence has not been “quiet” since it first went live – several weeks before IPv6 day was scheduled to occur.

A steady stream of IPv6 traffic is a good sign.

But while F5 experienced a relatively seamless transition to support IPv6 for this live test, many others did not. The NANOG mailing list was full of sites and issues arising from the day, including those arising from – you guessed it – load balancing issues.


Was participating until we hit a rather nasty Load balancer bug that took out the entire unit if clients with a short MTU connected and it  needed to fragment packets (Citrix Netscaler running latest code). No fix is available for it yet, so we had to shut it down. Ran for about 9 hours before the "magic" client that blew it up connected. 

-- IPv6 Day Non-Participants, NANOG

These bumps in the road to IPv6 were exactly the reason the Internet community at large needed a “live” test, and it’s the reason we’ll need another one. The migration from IPv4 to IPv6 will not happen over night, although it does need to happen more quickly than we may have once assumed given the record time in which depletion of IPv4 occurred. As we – and organizations – begin the move we’ll need more “live testing” days to flush out the bugs and bumps that can only be discovered by processing real traffic coming from real clients interacting with real applications. Lab testing simply isn’t good enough to assure ourselves and you that interoperability and a dual-version environment can be supported.

Just as odd application behavior can sometimes only appear under heavy load, so too is this true for protocols and networking components. Sometimes it’s only under high-load or in response to a request that isn’t specification perfect that we see issues arise that need to be addressed.

We’ll need more of these days in the future, though they will perhaps be less well publicized. Much in the same way NASA shuttle launches have become “old hat”, of real interest only to those keenly following such efforts, one would hope IPv6 “test” days would also become “old hat” over time, until one day we’re all running IPv6 exclusively.

For some astoundingly deep and #geek data regarding the day and performance, take some time to peruse RIPE NCC who is currently compiling data and presenting it in myriad ways as a means to help understand the impact of IPv6 on all aspects of performance.


<Return to section navigation list> 

Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

Steve Cimino posted Answers for the private cloud curious to SearchCloudComputing.com on 6/10/2011:

Few things intrigue a forward-thinking IT department more than private cloud. No more jam-packed server rooms, fewer performance bottlenecks; the improved economics and enhanced practicality promised by the cloud computing revolution. That's the idea, anyway.

But of course, there is a gulf between the concept and what's real. Enterprises must contend with legacy gear, so shifting today's server-based applications to a new platform requires a huge level of development and expensive man hours. Even if it's a secure, more trustworthy option than public cloud, many organizations would still safely dub private cloud as "on the horizon."

If you are planning for a private cloud, make sure to ask a lot of fundamental questions. What is private cloud? Who uses it? What will I get in return, and what is the risk? You need these answers, even if you're already equipped with some variation of an in-house cloud.

What is private cloud, and who wants it?
A common question is "What's the difference between virtualization and private cloud?" The simple answer is that private clouds feature metered usage, chargeback and on-demand self-service, all of which take it a step beyond the common virtualized server. Moving from virtualization to private cloud, however, is a logical progression for those intrigued by cloud computing.

Who exactly is intrigued by cloud, at least enough to allocate precious IT budget towards it, remains to be seen. In an October 2010 TechTarget survey, nearly two-thirds of the small and medium businesses that responded said "no thanks" when asked about their private cloud ambitions.

But a more recent survey noted that 60% of respondents claimed to have at least a partial private cloud infrastructure in place. In actuality, it's probably a combination of the two: Few organizations are ready to adopt a private cloud, but many have something they like to call "private cloud" in their data centers. …

Steve continues with

  • What are the major private cloud risks?
  • Will private offerings rule the cloud computing market?

topics.

Full disclosure: I’m a paid contributor to SearchCloudComputing.com.


<Return to section navigation list> 

Cloud Security and Governance

•• Anthony Savvas reported “The majority of cloud computing providers allocate just 10 per cent or less of IT resources to security, according to a survey from CA and security research firm the Ponemon Institute” as a deck for his 'Impending security standoff' between customers and cloud providers article of 6/12/2011 for ComputerWorld UK:

The majority of cloud computing providers allocate just 10 per cent or less of IT resources to security, according to a survey from CA and security research firm the Ponemon Institute.

The research showed that less than half of the respondents agreed or strongly agreed that "security is a priority". The study found that cloud providers are more focused on delivering benefits such as reduced costs and speed of deployment, rather than security.

Ponemon surveyed 103 cloud service providers in the US and 24 in six European countries for a total of 127 separate providers.

The results of the survey, said the companies, suggest there is a "pending security standoff between cloud providers and cloud users".

The study, "Security of Cloud Computing Providers", showed the majority of cloud providers (79 percent) allocate just 10 percent or less of IT resources to security or control-related activities.

"The focus on reduced cost and faster deployment may be sufficient for cloud providers now, but as organisations reach the point where increasingly sensitive data and applications are all that remains to migrate to the cloud, they will quickly reach an impasse," said Mike Denning, general manager for Security at CA Technologies.

He said, "If the risk of a breach outweighs potential cost savings and agility, we may reach a point of 'cloud stall' - where cloud adoption slows or stops, until organisations believe cloud security is as good as or better than enterprise security."

Other findings of the research found that less than 20 percent of cloud providers across the US and Europe viewed security as a competitive advantage.

And the majority of cloud providers (69 percent) believed security is primarily the responsibility of the cloud user. Just 16 percent of cloud providers felt security is a shared responsibility.

"Given the well-publicised concerns about the potential risks to organisations' sensitive and confidential information in the cloud, we believe it is only a matter of time until users of cloud computing solutions will demand enhanced security systems," said Larry Ponemon, chairman of the Ponemon Institute.

 


<Return to section navigation list> 

Cloud Computing Events

Bronwyn McNutt (@MissBronwyn, pictured below) requested on 6/10/2011 reposting of Ralph Squillace’s 5/30/2011 Free Windows Azure training with Scott Klein in San Francisco, June 13-14 post:

Hi all. If you're in the Bay Area and want to get up to speed on Windows Azure -- whether you want to learn how to use it or whether you want to validate your own approach! -- MVP Scott Klein of Blue Syntax Software (blog) (author and co-author of many books including Pro SQL Azure) is offering a free two-day hands-on training course in all of Windows Azure in downtown San Francisco on June 13-14 (registration and information here). I'll also be presenting and discussing the forthcoming release of the Windows Azure AppFabric June CTP including the AppFabric Development Platform and show you how to get your distributed cloud-based applications up and running quickly. In addition, my colleague Brian Swan will also be there to discuss using PHP, OData and Java in Windows Azure.


Scott has tons of experience to help you understand Azure, its services, and get you started building applications over two days -- few could be better to learn from. I am really looking forward to it. If you're in the area and interested, please come.


Microsoft reported Sir Richard Branson to Speak at Microsoft Worldwide Partner Conference, Los Angeles, July 13 in a 6/9/2011 press release:

REDMOND, Wash. — June 9, 2011 — Sir Richard Branson is slated to speak at the Microsoft Worldwide Partner Conference (WPC) on July 13, 2011. He will speak about the idea of “Winning Together” in business, the theme of the WPC.

Speaker Information

As founder and president of Virgin Group, Branson expanded the Virgin brand into air and rail travel, hospitality and leisure, telecommunications, health and wellness, and clean energy through more than 300 companies in 30 countries. A leader in product innovation and customer service, he was knighted in 1999 for his “services to entrepreneurship,” and in 2004, launched Virgin Unite to tackle tough challenges facing the world.

About WPC

  • Wednesday, July 13, 2011
  • 10:00 a.m. PST
  • Los Angeles Convention Center, 1201 South Figueroa Street, Los Angeles, Calif.


The WPC is an annual gathering for the Microsoft partner community to gain insights into the Microsoft business and technology roadmap for 2011, learn how to expand business possibilities with cloud services, Windows 7 and Windows Phone, and experience the latest in solution innovations. This year, the WPC will be held in Los Angeles, Calif., on July 10-14, 2011, at the Los Angeles Convention Center (LACC). Nearly 12,000 people have already registered for the event. For more information or registration, go to http://DigitalWpc.com.

As of 6/10/2011 there were 61 sessions in the Cloud Services track.


<Return to section navigation list> 

Other Cloud Computing Platforms and Services

•• Maureen O’Gara claimed “Ex-Sun Exec Rich Green Is Apparently Skedaddling” in her Nokia CTO Reportedly Overcome by Microsoft Fumes post of 6/12/2011 to the Cloud Computing Journal:

Rich Green, the two-time Sun exec who's been CTO at Nokia for the last year, has skedaddled. Officially he's taken a leave of absence for personal reasons but nobody expects him to come back.

Apparently he was overcome by the Microsoft fumes since Nokia dove headlong into Microsoft's arms a few months ago. Green used to run Java at Sun, was a key witness at the Microsoft trial and left Sun the first time when the companies made peace.

According to a Finnish paper, he would have preferred Nokia stake its future on MeeGo, the Linux OS it dumped when Microsoft and Win Phone entered the picture. His reported policy disagreement is with Nokia CEO Stephen Elop, who'll give him no sympathy, being an ex-Microsoft man and the guy who brought Microsoft in to fend off Android and iPhone.

Nokia has tapped Henry Tirri, head of the Nokia Research Center, to fill in.

The company is on its second profit warning, its debt is practically junk, its stock price is in the toilet and it's blamed for the anticipated Q2 shortfall TI announced this week because it didn't buy all the chips it was supposed to. Meanwhile, its rivals are eating its lunch.


• Klint Finley (@klintron, pictured below) asked Cloud Poll: Does iCloud Actually Have Anything to Do with Cloud Computing? in a 6/10/2011 survey post to the ReadWriteCloud blog:

In a refreshingly bitter post at TechTarget, Carl Brooks wrote: "Apple iCloud is not cloud computing."

Brooks went on: "You know what iCloud is? Streaming media. In other words, it's a Web service. Not relevant to cloud; not even in the ballpark."

But there are certainly some cloudy elements to iCloud. At the very least, it's a software-as-a-service. It fulfills the cloud promise of providing anytime, anywhere access to data.

And what about the architecture? GigaOM's Derrick Harris took a look at what we know so far. It's interesting, but it's hard to say at this point whether there's anything particularly "cloudy" about it - how well it makes use of virtualization and elastic resource provisioning.

Brooks, who is obviously suffering from a bit of cloud fatigue, wrote:

And you, IT person, grumpily reading this over your grumpy coffee and your grumpy keyboard, you have Apple to thank for turning the gas back on under the hype balloon. Now, when you talk about cloud to your CIO, CXO, manager or whomever, and their strange little face slowly lights up while they say, "Cloud? You mean like that Apple thing? My daughter has that..." and you have to explain it all over again, you will hate the words "cloud computing" even more.

So what do you think? Is iCloud actually cloud computing?

Does iCloud Actually Have Anything to Do with Cloud Computing? (online survey)

Full disclosure: I am a paid contributor to TechTarget’s SearchCloudComputing.com.


Maureen O’Gara asserted “It’s supposed to let users link computing resources across their organizations into a single high-performance private cloud” as a deck to her Cloud Computing: IBM’s Pitching Private HPC Clouds post of 6/10/2011 to the Cloud Computing Journal:

IBM's got itself an HPC cloud for advanced scientific and technical computing workloads like analytics, simulations for product development, climate research and life sciences.

It's supposed to let users link computing resources across their organizations into a single high-performance private cloud rather than operate HPC silos, and IBM preens that it's the only major vendor with an HPC private cloud solution.

The widgetry includes an HPC Management Suite, a quick-start cloud implementation service and Intelligent Cluster solutions with servers, storage and switches factory-integrated, tested and delivered ready to plug into the data center.

Blue is also ready to produce industry-specific versions of the stuff beginning with one for electronics companies and automotive and aerospace manufacturers.

It includes IBM Rational Software and Systems Engineering Solution, IBM Collaboration Hub and 2D/3D accelerators to create a secure, well-managed cloud optimized for engineering environments, along with applications from ISVs like Ansys, Cadence, Exa and Magma Design Automation.

Availability for everything starts sometime in Q3.


Matthew Weinberger (@MattNLM) reported HP, VMware Partner to Deliver Turnkey Cloud Virtualization in a 6/10/2011 post to the TalkinCloud:

HP and VMware took this week’s HP Discover conference in Las Vegas as an opportunity to announce HP VirtualSystem Solutions with VMware Virtualization, a joint offering. Together, HP and VMware will deliver proven, tested infrastructure stacks in the form of appliances. They’re hyping it as a quick way to get enterprises of all sizes started with virtualization and, by extension, cloud computing in general.

Built on HP Converged Infrastructure, the appliances will deliver turnkey virtual infrastructure and a full compute stack, from servers and storage to networking and services, powered by HP technology with VMware infrastructure and management software. According to the press release, the solution will be available in three different “scalable deployment options,” depending on the size and needs of the customer.

The two companies took time out to highlight the value to both VMware and HP solution providers, as HP VirtualSystem Solutions with VMware Virtualization can provide an easy way to leverage their expertise and deliver value with business continuity, automated management and other cloud benefits.

Honestly, details are fairly thin at this time; HP and VMware said the solution will be ready in the second half of 2011, and it sounds like we won’t find out any specifics until the launch gets closer. But stay tuned to TalkinCloud for updates on this news, as well as other insights from the HP Discover conference.

Still no news on HP’s adoption of the Windows Azure Platform Appliance (WAPA) as promised at last year’s Microsoft Worldwide Partner Conference.


Maureen O’Gara claimed “Tata Communications is reportedly about to launch its InstaCompute IaaS cloud in the US” as an intro to her Amazon’s About to Entertain a New Rival post of 6/10/2011 to the Cloud Computing Journal:

Amazon better watch its back.

Tata Communications, the $2.5 billion telecom arm of the even bigger $62.5 billion Tata Group, is reportedly about to launch its InstaCompute IaaS cloud in the US, meaning to go toe-to-toe with AWS.

In Amazon's favor is the fact that most Americans think "ta-ta" is something Brits say when somebody's leaving, but maybe, just for a change, the India-based company will create some jobs over here.

According to the charming deck of slides out on the net, it's a dead ringer for Amazon, down to the Xen virtualization, with only the pricing to be argued over. Seems it's not just depending on debit or credit cards; it'll take corporate purchase orders too.

Apparently it's using Cloud.com and Dell servers and targeting the usual test and dev and web applications with a 99.95% SLA.

It might come in with a 30-day free trial.

It supports Windows and Linux OSes, including Fedora, Ubuntu and CentOS.
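
For perspective on that SLA figure: assuming the 99.95% commitment is measured monthly (Tata’s slide deck doesn’t say), it works out to roughly 21.6 minutes of permitted downtime per 30-day month. Here’s a minimal C# sketch of the arithmetic, purely my own illustration rather than anything from Tata’s materials:

using System;

class SlaDowntimeBudget
{
    static void Main()
    {
        const double slaPercent = 99.95;          // InstaCompute's advertised SLA
        TimeSpan month = TimeSpan.FromDays(30);   // assumption: 30-day month, monthly measurement window

        // Allowed downtime = total minutes in the period * (100 - SLA%) / 100
        double allowedMinutes = month.TotalMinutes * (100.0 - slaPercent) / 100.0;

        // Prints: Allowed downtime per 30-day month: 21.6 minutes
        Console.WriteLine("Allowed downtime per 30-day month: {0:F1} minutes", allowedMinutes);
    }
}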


<Return to section navigation list> 
