Sunday, June 05, 2011

Windows Azure and Cloud Computing Posts for 6/3/2011+

A compendium of Windows Azure, Windows Azure Platform Appliance, SQL Azure Database, AppFabric and other cloud-computing articles.


• Updated Sunday, 6/5/2011 with articles marked from Michael Washington, Steve Nagy, Tim Negris, Simon Munro, Joe Hummel, Lohith Kashyapa and Martin Ingvar Kofoed Jensen. (Evaluating combining Friday, Saturday and Sunday articles.)

• Updated Saturday, 6/4/2011 with articles marked from Alik Levin, the Windows Azure Connect team, Azret Botash, Alex Popescu, Jnan Dash, George Trifanov, Doug Finke, Raghav Sharma, David Tesar, Elisa Flasko, MSDN Courses, Stefan Reid, John R. Rymer, Louis Columbus, Brian Harris and Me.

• Updated Friday, 6/3/2011 at 12:00 Noon PDT or later with articles marked from Klint Finley, Kenneth Chestnut, the Windows Azure OS Updates team, Steve Yi, Alex James, Peter Meister, Scott Densmore and the Windows Azure Team UK.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:

To use the above links, first click the post’s title to display the single article you want to navigate.

See New Navigation Features for the OakLeaf Systems Blog for a description of bulleted items.


Azure Blob, Drive, Table and Queue Services

Martin Ingvar Kofoed Jensen (@IngvarKofoed) explained the difference between Azure cloud blob properties and metadata in a 6/5/2011 post:

Both properties (some of them, at least) and the metadata collection of a blob can be used to store metadata for a given blob, but there are small differences between them. When working with blob storage, the number of HTTP REST requests plays a significant role when it comes to performance, and the number of requests becomes very important if the blob storage contains a lot of small files. There are at least three properties found in the CloudBlob.Properties property that can be used freely: ContentType, ContentEncoding and ContentLanguage. These can hold very large strings! I tried testing with a string containing 100,000 characters and it worked. They could possibly hold a lot more, but hey, 100,000 is a lot! So all three of them can be used to hold metadata.

So, what is the difference between using these properties and using the metadata collection? The difference lies in when they get populated. This is best illustrated by the following code:

CloudBlobContainer container; /* Initialized assumed */
CloudBlob blob1 = container.GetBlobReference("MyTestBlob.txt");
blob1.Properties.ContentType = "MyType";
blob1.Metadata["Meta"] = "MyMeta";
blob1.UploadText("Some content");

CloudBlob blob2 = container.GetBlobReference("MyTestBlob.txt");
string value21 = blob2.Properties.ContentType; /* Not populated */
string value22 = blob2.Metadata["Meta"]; /* Not populated */

CloudBlob blob3 = container.GetBlobReference("MyTestBlob.txt");
blob3.FetchAttributes();
string value31 = blob3.Properties.ContentType; /* Populated */
string value32 = blob3.Metadata["Meta"]; /* Populated */

CloudBlob blob4 = (CloudBlob)container.ListBlobs().First();
string value41 = blob4.Properties.ContentType; /* Populated */
string value42 = blob4.Metadata["Meta"]; /* Not populated */

BlobRequestOptions options = new BlobRequestOptions 
   { 
      BlobListingDetails = BlobListingDetails.Metadata 
   };
CloudBlob blob5 = (CloudBlob)container.ListBlobs(options).First();
string value51 = blob5.Properties.ContentType; /* Populated */
string value52 = blob5.Metadata["Meta"]; /* populated */

The difference shows up when you use ListBlobs on a container or blob directory, and it depends on the values of the BlobRequestOptions object. It might not seem like a big difference, but imagine that there are 10,000 blobs, each with a metadata string value 100 characters long. That sums to 1,000,000 extra characters to send when listing the blobs with metadata included. So if the metadata is not used every time you do a ListBlobs call, you might consider moving it to the Metadata collection and fetching it only when required, as the sketch below illustrates. I will investigate the performance of these methods of storing metadata for a blob in a later blog post.
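To make that pattern concrete, here is a minimal sketch (my illustration, not Martin's code) that lists blobs without metadata and calls FetchAttributes only for the blobs whose metadata is actually needed; the container variable and the "Meta" key are carried over from the example above:

// List cheaply (no metadata on the wire), then fetch metadata per blob only when required.
foreach (IListBlobItem item in container.ListBlobs())
{
    CloudBlob blob = item as CloudBlob;
    if (blob == null) continue; // skip blob directories

    // Properties such as ContentType are already populated by the listing.
    if (blob.Properties.ContentType == "MyType")
    {
        blob.FetchAttributes();              // one extra request for this blob only
        string meta = blob.Metadata["Meta"]; // now populated
    }
}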


Michał Morciniec (@morcinim) described How to Deploy Azure Role Pre-Requisite Components from Blob Storage in a 6/3/2011 post to his ISV Partner Support blog:

In Windows Azure SDK 1.3 we introduced the concept of startup tasks, which allow us to run commands to configure the role instance, install additional components and so on. However, this functionality requires that all pre-requisite components be part of the Azure solution package.

In practice this has the following limitations:

  • if you add or modify a pre-requisite component, you need to regenerate the Azure solution package
  • you have to pay the bandwidth charge of transferring the regenerated solution package (perhaps hundreds of MB) even though you actually want to update just one small component
  • the time to update the entire role instance is increased by the time it takes to transfer the solution package to the Azure datacenter
  • you cannot update an individual component; you must update the entire package

Below I describe an alternative approach, based on the idea of leveraging blob storage to store the pre-requisite components. Decoupling the pre-requisite components (in most cases they have no relationship with the Azure role implementation) has a number of benefits:

  • you do not need to touch the Azure solution package; simply upload a new component to the blob container with a tool like Windows Azure MMC
  • you can update an individual component
  • you only pay the bandwidth cost for the component you are uploading, not the entire package
  • your time to update the role instance is shorter because you are not transferring the entire solution package

Here is how the solution is put together:

When the Azure role starts, it downloads the components from a blob container that is defined in the .cscfg configuration file:

<Setting name="DeploymentContainer" value="contoso" />

The components are downloaded to a local disk of the Azure role. Sufficient disk space is reserved for a Local Resource disk defined in the .csdef definition file of the solution – in my case I reserve 2GB of disk space

<LocalStorage name="TempLocalStore" cleanOnRoleRecycle="false" sizeInMB="2048" />

Frequently, the pre-requisite components are installers (self-extracting executables or .msi files). In most cases they will use some form of temporary storage to extract their temporary files. Most installers allow you to specify the location for temporary files, but in the case of legacy or undocumented third-party components you may not have this option. Frequently, the default location will be the directory indicated by the %TEMP% or %TMP% environment variables. There is a 100MB limit on the size of the TEMP target directory that is documented in the Windows Azure general troubleshooting MSDN documentation.

To avoid this issue I implemented the mapping of the TEMP/TMP environment variables as indicated in this document. These variables point to the local disk we reserved above.

private void MapTempEnvVariable()
{
    string customTempLocalResourcePath =
        RoleEnvironment.GetLocalResource("TempLocalStore").RootPath;
    Environment.SetEnvironmentVariable("TMP", customTempLocalResourcePath);
    Environment.SetEnvironmentVariable("TEMP", customTempLocalResourcePath);
}

The OnStart() method of the role (in my case it is a “mixed Role” - Web Role that also implements a Run() method) starts the download procedure on a different thread and then blocks on a wait handle.

public class WebRole : RoleEntryPoint
{
    private readonly EventWaitHandle statusCheckWaitHandle = new ManualResetEvent(false);
    private volatile bool busy = true;
    private const int ThreadPollTimeInMilliseconds = 1500;

    // periodically check if the handle is signalled
    private void WaitForHandle(WaitHandle handle)
    {
        while (!handle.WaitOne(ThreadPollTimeInMilliseconds))
        {
            Trace.WriteLine("Waiting to complete configuration for this role .");
        }
    }
    [...]
    public override bool OnStart()
    {
        // start another thread that will carry out the configuration
        var startThread = new Thread(OnStartInternal);
        startThread.Start();
        [...]
        // Block this thread waiting on a handle
        WaitForHandle(statusCheckWaitHandle);
        return base.OnStart();
    }
}

Whether you block in OnStart() or continue depends on the nature of the pre-requisite components and where they are used. If you need your components to be installed before you enter Run(), you should block, because Azure will call this method pretty much immediately after you exit OnStart(). On the other hand, if you have a pure Web role you could choose not to block. Instead, you would handle the StatusCheck event and indicate that you are still busy configuring the role – the role will be in the Busy state and will be taken out of the NLB rotation.

private void RoleEnvironmentStatusCheck(object sender, RoleInstanceStatusCheckEventArgs e)
{
    if (this.busy)
    {
        e.SetBusy();
        Trace.WriteLine("Role status check-telling Azure this role is BUSY configuring, traffic will not be routed to it.");
    }
    // otherwise role is READY
    return;
}

If you choose to block in OnStart(), there is little point in implementing the StatusCheck event handler, because while OnStart() is running the role is in the Busy state and the status check event is not raised until OnStart() exits and the role moves to the Ready state.

The function OnStartInternal() is where the components are downloaded from the preconfigured Blob container and written to local disk that we allocated above. When downloading the files we look for a file named command.bat.

This command batch file holds commands that we will run after the download completes, in a similar way to startup tasks. Here is the sample command.bat:

vcredist_x64.exe /q
exit /b 0

I should point out that if you are dealing with large files you should definitely specify the Timeout property in the BlobRequestOptions for the DownloadToFile() function. The timeout is calculated as the product of the size of the blob in MB and the maximum time we are prepared to wait to transfer 1MB. You can tweak the timeoutInSecForOneMB variable to adjust the code to the network conditions. Any value that works when you deploy the solution to the Compute Emulator and use the Azure blob store will also work when you deploy to Azure Compute, because there you would use blob storage in the same datacenter and the latencies would be much lower.

foreach (CloudBlob blob in container.ListBlobs(new BlobRequestOptions() { UseFlatBlobListing = true }))
{
    string blobName = blob.Uri.LocalPath.Substring(blob.Parent.Uri.LocalPath.Length);
    if (blobName.Equals("command.bat"))
    {
        commandFileExists = true;
    }
    string file = Path.Combine(deployDir, blobName);

    Trace.TraceInformation(System.DateTime.UtcNow.ToUniversalTime() + " Downloading blob " + blob.Uri.LocalPath + " from deployment container " + container.Name + " to file " + file);

    // the time in seconds in which 1 MB of data should be downloaded (otherwise we timeout)
    int timeoutInSecForOneMB = 5;
    double sizeInMB = System.Math.Ceiling((double)blob.Properties.Length / (1024 * 1024));

    // we set large enough timeout to download the blob of the given size
    blob.DownloadToFile(file, new BlobRequestOptions() { Timeout = TimeSpan.FromSeconds(timeoutInSecForOneMB * sizeInMB) });
    FileInfo fi = new FileInfo(file);
    Trace.TraceInformation(System.DateTime.UtcNow.ToUniversalTime() + " Saved blob to file " + file + "(" + fi.Length + " bytes)");
}

After the download has completed, the RunProcess() function (sketched below) spawns a child process to run the batch file. The runtime process host WaIISHost.exe is configured to run elevated. This is done in the service definition file using the Runtime element, and is required so that the batch commands have admin rights (the process will run under the Local System account):

<Runtime executionContext="elevated">     
</Runtime>
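The RunProcess() helper itself is not shown in this excerpt; a rough sketch of what it might look like (the name and purpose come from the text, the body and the deployDir field are my assumptions) is:

// Hypothetical sketch of the RunProcess() helper referenced above: run the
// command in the download directory and wait for it to finish.
private void RunProcess(string fileName, string arguments)
{
    var startInfo = new System.Diagnostics.ProcessStartInfo(fileName, arguments)
    {
        WorkingDirectory = deployDir,   // assumed field holding the local resource path
        UseShellExecute = false,
        CreateNoWindow = true
    };

    using (var process = System.Diagnostics.Process.Start(startInfo))
    {
        Trace.WriteLine("Information: Executing: " + fileName);
        process.WaitForExit();
        Trace.WriteLine("Information: Process Exit Code: " + process.ExitCode);
    }
}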

Then we signal the wait handle (if implemented) and/or clear the busy flag that is used with the StatusCheck event, to indicate that configuration has completed and the role can move to the Ready state:

Trace.WriteLine("---------------Starting Deployment Batch Job--------------------");
if (commandFileExists)
RunProcess("cmd.exe", "/C command.bat");
else
Trace.TraceWarning("Command file command.bat was not found in container " + container.Name);

Trace.WriteLine(" Deployment completed.");

// signal role is Ready and should be included in the NLB rotation
this.busy = false;
//signal the wait handle to unblock OnStart() thread - if waiting has been implemented.
statusCheckWaitHandle.Set();

The sample solution includes a diagnostics.wadcfg file that transfers the traces and events to blob storage every 60 seconds (you will want to delete or rename it once you go to production to avoid the associated storage costs).

<?xml version="1.0" encoding="utf-8"?>
<DiagnosticMonitorConfiguration xmlns="http://schemas.microsoft.com/ServiceHosting/2010/10/DiagnosticsConfiguration" configurationChangePollInterval="PT5M" overallQuotaInMB="3005">
<Logs bufferQuotaInMB="250" scheduledTransferLogLevelFilter="Verbose" scheduledTransferPeriod="PT01M" />
<WindowsEventLog bufferQuotaInMB="50"
scheduledTransferLogLevelFilter="Verbose"
scheduledTransferPeriod="PT01M">
<DataSource name="System!*" />
<DataSource name="Application!*" />
</WindowsEventLog>
</DiagnosticMonitorConfiguration>

You can then deploy the solution to Azure Compute and look up the traces in the WADLogsTable using any Azure storage access tool such as Azure MMC.

Entering OnStart() - 03/06/2011 9:09:51
Blocking in OnStart() - waiting to be signalled.03/06/2011 9:09:51
OnStartInternal() - Started 03/06/2011 9:09:51
Waiting to complete configuration for this role .
---------------Downloading from Storage--------------------
03/06/2011 9:09:53 Downloading blob /contoso/command.bat from deployment container contoso to file C:\Users\micham\AppData\Local\dftmp\s0\deployment(688)\res\deployment(688).StorageDeployment.StorageDeployWebRole.0\directory\TempLocalStore\command.bat; TraceSource 'WaIISHost.exe' event
03/06/2011 9:09:53 Saved blob to file C:\Users\micham\AppData\Local\dftmp\s0\deployment(688)\res\deployment(688).StorageDeployment.StorageDeployWebRole.0\directory\TempLocalStore\command.bat(36 bytes); TraceSource 'WaIISHost.exe' event
03/06/2011 9:09:53 Downloading blob /contoso/vcredist_x64.exe from deployment container contoso to file C:\Users\micham\AppData\Local\dftmp\s0\deployment(688)\res\deployment(688).StorageDeployment.StorageDeployWebRole.0\directory\TempLocalStore\vcredist_x64.exe; TraceSource 'WaIISHost.exe' event
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
03/06/2011 9:10:01 Saved blob to file C:\Users\micham\AppData\Local\dftmp\s0\deployment(688)\res\deployment(688).StorageDeployment.StorageDeployWebRole.0\directory\TempLocalStore\vcredist_x64.exe(5718872 bytes); TraceSource 'WaIISHost.exe' event
---------------Starting Deployment Batch Job--------------------
Information: Executing: cmd.exe
; TraceSource 'WaIISHost.exe' event
C:\Users\micham\AppData\Local\dftmp\s0\deployment(688)\res\deployment(688).StorageDeployment.StorageDeployWebRole.0\directory\TempLocalStore>vcredist_x64.exe /q ; TraceSource 'WaIISHost.exe' event
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
Waiting to complete configuration for this role .
; TraceSource 'WaIISHost.exe' event
; TraceSource 'WaIISHost.exe' event
C:\Users\micham\AppData\Local\dftmp\s0\deployment(688)\res\deployment(688).StorageDeployment.StorageDeployWebRole.0\directory\TempLocalStore>exit /b 0 ; TraceSource 'WaIISHost.exe' event
; TraceSource 'WaIISHost.exe' event
Information: Process Exit Code: 0
Process execution time: 13,151 seconds.
Deployment completed.
Exiting OnStart() - 03/06/2011 9:10:14
Information: Worker entry point called
Information: Working

If you need to redeploy a component, you just upload the changed component version to blob storage and restart your role to pick it up. When you combine this technique with Web Deploy to update the code of an Azure web role, you should notice that the whole update process is faster and more flexible. This approach is most useful in development and testing, where you would otherwise redeploy the solution multiple times.

You can download the solution here.


Joseph Fultz wrote Multi-Platform Windows Azure Storage for smartphones in MSDN Magazine’s June 2011 issue:

Download the Code Sample

Windows Azure Storage is far from a device-specific technology, and that’s a good thing. This month, I’ll take a look at developing on three mobile platforms: Windows Phone 7, jQuery and Android.

To that end, I’ll create a simple application for each that will make one REST call to get an image list from a predetermined Windows Azure Storage container and display the thumbnails in a filmstrip, with the balance of the screen displaying the selected image as seen in Figure 1.


Figure 1 Side-by-Side Storage Image Viewers

Preparing the Storage Container

I’ll need a Windows Azure Storage account and the primary access key for the tool I use for uploads. I would also need the key for secure access from the clients I’m developing. That information can be found in the Windows Azure Platform Management Portal.

I grabbed a few random images from the Internet and my computer to upload. Instead of writing upload code, I used the Windows Azure Storage Explorer, found at azurestorageexplorer.codeplex.com (a scripted alternative is sketched below). For reasons explained later, images need to be less than about 1MB; if using this code exactly, it’s best to stay at 512KB or less. I create a container named Filecabinet; once the container is created and the images uploaded, the Windows Azure piece is ready.
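If you would rather script this step than use Storage Explorer, a sketch along these lines (not from the article; the account name and key are placeholders, the container name comes from the text) does the same job with the Windows Azure storage client library:

// Create the public container and upload one image (placeholder credentials).
var account = new CloudStorageAccount(
    new StorageCredentialsAccountAndKey("youraccount", "primary-access-key"), true);
CloudBlobClient client = account.CreateCloudBlobClient();

CloudBlobContainer container = client.GetContainerReference("filecabinet");
container.CreateIfNotExist();
container.SetPermissions(new BlobContainerPermissions
{
    // Container-level public access so anonymous clients can list the blobs.
    PublicAccess = BlobContainerPublicAccessType.Container
});

CloudBlob blob = container.GetBlobReference("sample.jpg");
blob.Properties.ContentType = "image/jpeg";
blob.UploadFile(@"C:\images\sample.jpg");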

Platform Paradigms

Each of the platforms brings with it certain constructs, enablers and constraints. For the Silverlight and Android clients, I took the familiar model of marshaling the data into an object collection for consumption by the UI. While jQuery has support for templates, in this case I found it easier to simply fetch the XML and generate the needed HTML via jQuery directly, making the jQuery implementation rather flat. I won’t go into detail about its flow, but for the other two examples I want to give a little more background.

Windows Phone 7 and Silverlight

If you’re familiar with Silverlight development you’ll have no problem creating a new Windows Phone 7 application project in Visual Studio 2010. If not, the things you’ll need to understand for this example are observable collections (bit.ly/18sPUF), general XAML controls (such as StackPanel and ListBox) and the WebClient.

At the start, the application’s main page makes a REST call to Windows Azure Storage, which is asynchronous by design. Windows Phone 7 forces this paradigm as a means to ensure that any one app will not end up blocking and locking the device. The data retrieved from the call will be placed into the data context and will be an ObservableCollection<> type.

This allows the container to be notified and pick up changes when the data in the collection is updated, without me having to explicitly execute a refresh. While Silverlight can be very complex for complex UIs, it also provides a low barrier to entry for relatively simple tasks such as this one. With built-in support for binding and service calls, added to the support for touch and physics in the UI, the filmstrip-to-zoom view is no more difficult than writing an ASP.NET page with a data-bound grid.
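As a rough illustration of that pattern (not Joseph's code, which is in the downloadable sample; the storage URL and the PictureInfo type are invented for the sketch), the asynchronous REST call and the observable collection might be wired up like this:

// Sketch: async List Blobs call parsed into an ObservableCollection the page binds to.
public class PictureInfo
{
    public string Url { get; set; }
}

public partial class MainPage : PhoneApplicationPage
{
    private readonly ObservableCollection<PictureInfo> pictures =
        new ObservableCollection<PictureInfo>();

    public MainPage()
    {
        InitializeComponent();
        DataContext = pictures;   // the ListBox ItemsSource binds to this collection

        var client = new WebClient();
        client.DownloadStringCompleted += OnListCompleted;
        client.DownloadStringAsync(new Uri(
            "http://youraccount.blob.core.windows.net/filecabinet?comp=list"));
    }

    private void OnListCompleted(object sender, DownloadStringCompletedEventArgs e)
    {
        if (e.Error != null) return;

        // The List Blobs REST response is XML; each <Url> element points at one image.
        foreach (var url in XDocument.Parse(e.Result).Descendants("Url"))
        {
            pictures.Add(new PictureInfo { Url = url.Value });
        }
    }
}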

Android Activity

Android introduces its own platform paradigm. Fortunately, it isn’t something far-fetched or hard to understand. For those who are primarily .NET developers, it’s easy to map to familiar constructs, terms and technologies. In Android, pretty much anything you want to do with the user is represented as an Activity object. The Activity could be considered one of the four basic elements that may be combined to make an application: Activity, Service, Broadcast Receiver and Content Provider.

For simplicity, I’ve kept all code in Activity. In a more realistic implementation, the code to fetch and manipulate the images from Windows Azure Storage would be implemented in a Service.

This sample is more akin to putting the database access in the code behind the form. From a Windows perspective, you might think of an Activity as a Windows Form that has a means—in fact, the need—to save and restore state. An Activity has a lifecycle much like Windows Forms; an application can be made of one to many Activities, but only one will be interacting with the user at a time. For more information on the fundamentals of Android applications, go to bit.ly/d3c7C.

This sample app will consist of a single Activity (bit.ly/GmWui), a BaseAdapter (bit.ly/g0J2Qx) and a custom object to hold information about the images.

Creating the UIs

The common tasks in each of the UI paradigms include making the REST call, marshaling the return to a useful datatype, binding and displaying, and handling the click event to show the zoom view of the image. First, I want to review the data coming back from the REST Get, and what’s needed for each example to get the data to something more easily consumable. …

Joseph continues with a detailed description of the required code for Silverlight and Android applications.

Joseph is a software architect at AMD, helping to define the overall architecture and strategy for portal and services infrastructure and implementations. Previously he was a software architect for Microsoft working with its top-tier enterprise and ISV customers defining architecture and designing solutions.

Full disclosure: 1105 Media publishes MSDN Magazine and I’m a contributing editor of their Visual Studio Magazine.


See also Roger Struckhoff (@struckhoff), who announced “Microsoft's Brian Prince Outlines Strategy at Cloud Expo” in his Microsoft Leverages Cloud Storage for Massive Scale post of 6/3/2011 to the Cloud Computing Journal, in the Cloud Computing Events section below.


<Return to section navigation list> 

SQL Azure Database, Data Sync, Compact and Reporting

See also The Windows Azure Connect Team explained Speeding Up SQL Server Connections in a 6/3/2011 post article in the Windows Azure VM Role, Virtual Network, Connect, RDP and CDN section below.

• Kenneth Chestnut wrote Getting in a Big Data state of mind on 6/3/2011 for SD Times on the Web:

One of the most hyped terms in information management today is Big Data. Everyone seems excited by the concept and related possibilities, but is the "Big" in Big Data mere[ly] a state of mind? Yes, it is.

Organizations are realizing the potential challenges resulting from the explosion of unstructured information: e-mails, images, log files, cables, user-generated content, documents, videos, blogs, contracts, wikis, Web content... the list goes on. Traditional technologies and information management practices may no longer prove sufficient given today’s information environment. Therefore, forward-looking organizations and individuals are evaluating new technologies and concepts like Big Data in an attempt to address the unstructured information overload that is currently underway. Some forward-thinking organizations even view these challenges as opportunities to derive new insight and gain competitive advantage.

On the other side of the equation, some technology vendors are equally excited by the potential of Big Data as a Big Market Opportunity. Storage vendors focus on the sheer volume of Big Data (petabytes of information giving way to exabytes and eventually zettabytes) and the need for organizations to have efficient and comprehensive storage for all of that information. Data warehousing and business intelligence vendors emphasize the need for advanced statistical and predictive analytical capabilities to sift through the vast volume of information to make sense of it, and to find the proverbial needle in the haystack, typically using newer technologies such as MapReduce and Hadoop, which are cheaper than previous technologies. And so on.

While advanced organizations see tremendous opportunity for harnessing Big Data, how do they start addressing both the challenges and the potential opportunities given the numerous definitions and confusion that exists in the marketplace today? What if I belong to an organization that doesn’t have information in the petabytes (going to exabytes) scale? Does Big Data apply to me—or you?

This is why Big Data can be considered a state of mind rather than something specific. It becomes a concept that is applicable to any organization that feels that current tools, technologies, and processes are no longer sufficient for managing and taking advantage of their information management needs, regardless of whether its data is measured in gigabytes, terabytes, exabytes or zettabytes. …

Read more: Pages 2, 3, 4

Kenneth is vice president of product marketing at MarkLogic, which sells Big Data solutions.


Erik Ejlskov Jensen (@ErikEJ) posted Populating a Windows Phone “Mango” SQL Server Compact database on desktop on 6/2/2011:

If you want to prepopulate a Mango SQL Server Compact database with some data, you can use the following procedure. (Note that the Mango tools are currently in beta.)

First, define your data context, and run code to create the database on the device/emulator, using the CreateDatabase method of the DataContext. See a sample here. This will create the database structure on your device/emulator, but without any initial data.

Then use the Windows Phone 7 Isolated Storage Explorer to copy the database from the device to your desktop, as described here.

You can now use any tool – see the list of third-party tools on this blog – to populate your tables with data as required. The file format is version 3.5 (not 4.0).

Finally, include the pre-populated database in your WP application as an Embedded Resource.

[Screenshot: marking the .sdf file as an Embedded Resource]

You can then use code like the following to write out the database file to Isolated Storage on first run:

public class Chinook : System.Data.Linq.DataContext
{
    public static string ConnectionString = "Data Source=isostore:/Chinook.sdf";

    public static string FileName = "Chinook.sdf";

    public Chinook(string connectionString) : base(connectionString) { }

    public void CreateIfNotExists()
    {
        using (var db = new Chinook(Chinook.ConnectionString))
        {
            if (!db.DatabaseExists())
            {
                string[] names = this.GetType().Assembly.GetManifestResourceNames();
                string name = names.Where(n => n.EndsWith(FileName)).FirstOrDefault();
                if (name != null)
                {
                    using (Stream resourceStream = Assembly.GetExecutingAssembly().GetManifestResourceStream(name))
                    {
                        if (resourceStream != null)
                        {
                            using (IsolatedStorageFile myIsolatedStorage = IsolatedStorageFile.GetUserStoreForApplication())
                            {
                                using (IsolatedStorageFileStream fileStream = new IsolatedStorageFileStream(FileName, FileMode.Create, myIsolatedStorage))
                                {
                                    using (BinaryWriter writer = new BinaryWriter(fileStream))
                                    {
                                        long length = resourceStream.Length;
                                        byte[] buffer = new byte[32];
                                        int readCount = 0;
                                        using (BinaryReader reader = new BinaryReader(resourceStream))
                                        {
                                            // read file in chunks in order to reduce memory consumption and increase performance
                                            while (readCount < length)
                                            {
                                                int actual = reader.Read(buffer, 0, buffer.Length);
                                                readCount += actual;
                                                writer.Write(buffer, 0, actual);
                                            }
                                        }
                                    }
                                }
                            }
                        }
                        else
                        {
                            db.CreateDatabase();
                        }
                    }
                }
                else
                {
                    db.CreateDatabase();
                }
            }
        }
    }
}
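A typical way to consume this class (a usage sketch, not part of Erik's post) is to call CreateIfNotExists once during application startup, before the first query:

// Usage sketch: make sure the pre-populated database is in isolated storage,
// then query it as usual with LINQ to SQL.
using (var db = new Chinook(Chinook.ConnectionString))
{
    db.CreateIfNotExists();
    // var albums = db.GetTable<Album>().ToList();  // Album is a hypothetical table class
}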


[Please] with (c) (@pleasewithc) described Working with pre-populated SQL CE databases in WP7 in a 6/2/2011 post:

[Don't miss the update below]

When I put my first pre-populated database file into a Mango project & ran it, I was unable to access the distributed database:

“Access to the database file is not allowed.”

Oddly, this error doesn’t appear until you actually attempt to select a record from the database.

When I originally created this database in my DB generator app, I used the following URI as a connection string:

“isostore:/db.sdf”

As you might expect this saves the database to the Isolated Storage of the DB generator app. I point this out because after this database is extracted from the DB generator app, added to your “real” project & deployed to a phone, the database file gets written into a read-only storage area which you can connect to via the appdata:/ URI:

“appdata:/db.sdf”

Although the DataContext constructs without a problem, using the appdata:/ URI, attempting to select records from this database threw the file access error above.

I wanted to rule out whether my database had actually been deployed at all, so I tried connecting to a purposely bogus file name.

That gave me a different error (file not found), so I knew the database had been deployed and was in the expected location; but strangely, it looks like DataContext / LINQ to SQL doesn’t let me select records while the file resides in the app deployment area, which effectively renders the “appdata:/” connection URI somewhat pointless :P

What you can do is copy the file out of appdata & into isostore, along the lines of the sketch below:
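(The original post shows this step as a screenshot; the sketch below is my reconstruction of the idea, assuming the database was added to the project as Content with the name db.sdf.)

// Copy the database shipped in the XAP (appdata) into isolated storage so it
// can be opened read/write via "isostore:/db.sdf".
using (Stream input = Application.GetResourceStream(
           new Uri("db.sdf", UriKind.Relative)).Stream)
using (IsolatedStorageFile store = IsolatedStorageFile.GetUserStoreForApplication())
{
    if (!store.FileExists("db.sdf"))
    {
        using (IsolatedStorageFileStream output = store.CreateFile("db.sdf"))
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
    }
}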

This may be something to keep in mind, esp. if you planned to deploy a massive database with your app (although distributing a heavy .xap to the marketplace probably isn’t very advisable in the first place).

Arsanth’s database is only ~200KB at the moment, so producing a working copy is a trivial expense in terms of both storage space & IO time. I was just slightly surprised I actually had to do this in order to select against a pre-populated database distributed with the app, so be aware of this gotcha.

Update

Thanks to @gcaughey who pointed out connection string file access modes, which I was not aware of yet.

So instead of copying to Isolated Storage, I tried this connection string:

“datasource=’appdata:/db.sdf’; mode=’read only’”

And sure enough I can select records directly from the distributed database. Right on! If you want full CRUD & persistence against your distributed database, you can always create a copy as described; but if all you need to do is select from the distributed database, you can set the file access mode in the connection string.


Bob Beauchemin (@bobbeauch) reported an Interesting SQL Azure change with next SR on 6/1/2011:

In looking at the What’s New for SQL Azure (May 2011) page, I came across the following: "Upcoming Increased Precision of Spatial Types: For the next major service release, some intrinsic functions will change and SQL Azure will support increased precision of Spatial Types."

There are a few interesting things about this announcement.

Firstly, the increased precision for spatial types is not a SQL Server 2008 R2 feature; it's a Denali CTP1 feature. Although the article doesn't indicate whether they've made up a special "pre-Denali" version of this feature, or when exactly "the next major service release" will be (and when SQL Server Denali will be released is unknown), it would be interesting if updated SQL Server spatial functionality made its appearance in SQL Azure *before* making its appearance in an on-premise release of SQL Server. As far as I know, this will be the first time a new, non-deprecation feature is deployed in the cloud before on-premise (non-deprecation because, for example, the COMPUTE BY clause fails in SQL Azure but not in any on-premise RTM release of SQL Server). Note that usage of SQL Server "opaque" features (for example, are instances managed internally by a variant of the Utility Control Point concept?) cannot be determined.

In addition, this may be the first "impactful change" (BOL doesn't say breaking change, but change with a possible impact, but one never knows what the impact would be in other folks' applications) in SQL Azure Database. The BOL entry continues "This will have an impact on persisted computed columns as well as any index or constraint defined in terms of the persisted computed column. With this service release SQL Azure provides a view to help determine objects that will be impacted by the change. Query sys.dm_db_objects_impacted_on_version_change (SQL Azure Database) in each database to determine impacted objects for that database."

Here's a couple of object definitions that will populate this DMV:

create table spatial_test (
id int identity primary key,
geog geography,
area as geog.STArea() PERSISTED
);

-- one row, class_desc INDEX, for the clustered index
select * from sys.dm_db_objects_impacted_on_version_change

ALTER TABLE spatial_test
ADD CONSTRAINT check_area CHECK (area > 50);

-- two more rows, class_desc OBJECT_OR_COLUMN, for the constraint object
select * from sys.dm_db_objects_impacted_on_version_change

Before this, the SQL Azure Database koan was "Changes are always backward-compatible". There is now the sys.dm_db_objects_impacted_on_version_change DMV, and the BOL page for it even provides sample DDL to handle the impacted objects. But this raises the question: I can run the DMV to determine objects that would be impacted and fix them when the change occurs, but if I don't know when the SU will be released, how can I plan and stage my app change to correspond? Interesting times ahead...


<Return to section navigation list> 

MarketPlace DataMarket and OData

••• Lohith Kashyapa (@kashyapa) described Performing CRUD on OData Services using DataJS in a 6/5/2011 post:

Recently I released a project named Netflix Catalog using HTML + OData + DataJS + jQuery. I had a request from a reader asking for a CRUD example using DataJS. This post is a demo application for performing Create, Read, Update and Delete on an OData service, all by just using the DataJS client-side framework. I have become a big fan of DataJS and I will do whatever it takes to promote it. So this post is a walkthrough and step-by-step guide for using DataJS and performing CRUD with it. So stay with me and follow carefully.

Pre-Requisites:

To follow along with this demo, you do not need a licensed Visual Studio edition like Professional or Ultimate. As far as possible I tend to create my demos and examples using the free Express versions available. So here are the pre-requisites:

I prefer IIS Express because it is as if you are developing on a production-level IIS, but one which runs locally on your desktop as a user service rather than a machine-level service. More information on this from Scott Gu here.

Also, I am using SQL CE – SQL Server Compact Edition – as my backend store. My EF model is tied to this database. But if you want to replace it with SQL Express you can; you just need to change the connection string of the EF data model in web.config. But if you want to follow my path then you need to do the following:

If you have finished the pre-requisite part, then now the real fun begins. Read on:

Step 1: Create ASP.NET Web Site

Fire up Visual Web Developer 2010 Express and select File > New Web Site. Select "Visual C#" from the installed templates and "ASP.NET Web Site" from the project templates. Select File System for the web location and provide a path for the project. Click OK to create the project.

Note: I will be developing this demo using C# as a language.

[Screenshot: DataJSCRUD1]

Step 2: Clean Up

Once the project is created, it will have some default files and folder system as shown below.

[Screenshot: DataJSCRUD2]

For this demo we do not need any of those, so let's clean up those files and folders. I will be deleting the "Account" folder, "About.aspx", "Default.aspx" and "Site.master". We will need the Scripts, App_Data and Styles folders, so let's leave them as-is for now. Here is how the solution looks after clean-up:

[Screenshot: DataJSCRUD3]

Step 3: Set Web Server

By default when you create any web site projects, Visual Studio will use the “Visual Studio Development Server”. In order to use IIS Express as the web server for the project, right click on the solution and select the option “Use IIS Express”.

[Screenshot: DataJSCRUD4]

Step 4: Create SQL CE Database

As I said earlier, I will be using SQL CE as the backend store, so let's create a simple database for maintaining user account details – userid, username, email and password. Right-click the solution and select "Add New Item". I have named my database "users". Note that the extension for the data file is .sdf and it will be stored as a file in your App_Data folder, so if you want to deploy this somewhere you can just zip and send it, and the app will be up and running after installation.

[Screenshot: DataJSCRUD5]

Note: You will get the following warning when adding the database. Select “Yes”. All it says is we are adding a special file and will place it in App_Data folder.

[Screenshot: DataJSCRUD6]

Step 5: Create Table

Let's create a new table in the database. It will be a simple table with 4 columns, namely:

  • UserID which is the PK and Identity column

  • UserName, string column

  • Email, string column

  • Password, string column

Expand the App_Data folder and double-click the Users.sdf file. This will open the Database Explorer with a connection to the Users database.

[Screenshot: DataJSCRUD7]

Right click on Tables and select “Create Table”. You will get the table creation UI. My table definition looks like below:

[Screenshot: DataJSCRUD8]


Step 6: Create Entity Framework Model

Right click on solution and select “Add New Item”. Select “ADO.NET Entity Data Model” item type. I have named my model as “UsersModel.edmx”.

[Screenshot: DataJSCRUD9]

When asked to choose Model contents select “Generate From Database” and click next.

[Screenshot: DataJSCRUD10]

Next you will have to choose the data connection. By default the Users.sdf database that we created will be listed. Give a name for the entities and click Next.

[Screenshot: DataJSCRUD11]

Next you have to choose the database objects. Expand Tables and select Users. Give a name for the model namespace; in my case I have named it UsersModel. Click Finish.

[Screenshot: DataJSCRUD12]

After this, UsersModel.edmx will be added to the App_Code folder. So we now have the EDMX ready for creating an OData service.

Step 7: Create OData Service

Now we will create an OData service to expose our Users data. Right-click the solution and select "Add New Item". Select WCF Data Service from the item templates and give it a name – in my case I have named it "UsersOData.svc".

[Screenshot: DataJSCRUD14]

The .svc file will be kept in the root folder, whereas its code-behind will be placed in the App_Code folder. At this point, here is how our solution looks:

[Screenshot: DataJSCRUD15]

Open the UsersOData.cs file from the App_Code folder. You will see the following code set up for you.

public class UsersOData : DataService< /* TODO: put your data source class name here */ >
{
    // This method is called only once to initialize service-wide policies.
    public static void InitializeService(DataServiceConfiguration config)
    {
        // TODO: set rules to indicate which entity sets and service operations are visible, updatable, etc.
        // Examples:
        // config.SetEntitySetAccessRule("MyEntityset", EntitySetRights.AllRead);
        // config.SetServiceOperationAccessRule("MyServiceOperation", ServiceOperationRights.All);
        config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
    }
}

The TODO comment tells you to put your data source class name there: the EF entity container name from the step where we added the Entity Framework model. In my case I had named it UsersEntities. We also need to set some entity set access rules. After those changes, here is the modified code:

public class UsersOData : DataService<UsersEntities>
{
    // This method is called only once to initialize service-wide policies.
    public static void InitializeService(DataServiceConfiguration config)
    {
        config.SetEntitySetAccessRule("*", EntitySetRights.All);
        config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
        config.UseVerboseErrors = true;
    }
}

At this point, if you run the application and navigate to UsersOData.svc you will receive an error:

[Screenshot: DataJSCRUD16]

It says it can't find the type that implements the service. Thanks to the tip provided at http://ef4templates.codeplex.com/documentation, I learned that the .svc file has an attribute called Service which points to the implementation class – in our case the UsersOData class in UsersOData.cs. But we need to qualify the class with a namespace and provide the fully qualified name to the Service attribute. Open the UsersOData.cs file and wrap the class inside a namespace as below:

namespace UserServices
{
    public class UsersOData : DataService<UsersEntities>
    {
        // This method is called only once to initialize service-wide policies.
        public static void InitializeService(DataServiceConfiguration config)
        {
            config.SetEntitySetAccessRule("*", EntitySetRights.All);
            config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
            config.UseVerboseErrors = true;
        }
    }
}

Next, open the UsersOData.svc file and modify the Service attribute to include the namespace and the class name as below:

<%@ ServiceHost
    Language="C#"
    Factory="System.Data.Services.DataServiceHostFactory"
    Service="UserServices.UsersOData" %>

If you now navigate to UsersOData.svc in Internet Explorer you should see the following:

[Screenshot: DataJSCRUD17]

So we have our OData service up and running with one collection, "Users". Next we will see how to do CRUD on this service using DataJS.

Step 8: Create Index.htm, Import DataJS & JQuery

Now let's create the front end for the application. Right-click the solution and select "Add New Item". Select an HTML page and name it "Index.htm". I will be using jQuery from Microsoft's AJAX CDN. The project comes bundled with jQuery 1.4.1, but I prefer using the CDN as I get the latest versions hosted at one of the most reliable places on the net. Here are the script references:

<script
    type="text/javascript"
    src="http://ajax.aspnetcdn.com/ajax/jquery/jquery-1.5.1.min.js">
</script>
<script
    type="text/javascript"
    src="http://ajax.aspnetcdn.com/ajax/jquery.ui/1.8.13/jquery-ui.min.js">
</script>

Next download the DataJS library from the following location: http://datajs.codeplex.com/

Next we need one more jQuery plugin, for templating. Download the jQuery templating library from: https://github.com/jquery/jquery-tmpl.

Let me give you a sneak preview of what we are building. We will have a <table> which lists all the available user accounts, the ability to create a new user account, the ability to update an existing account and the ability to delete a user account.

[Screenshot: DataJSCRUD18]

I will not get into the UI markup now; you can download the code and go through it. Let's start looking at how DataJS is used for the CRUD operations.

Read Operation:

DataJS has a "read" API that allows us to read data from the OData service. In its simplest form the read API takes a URL and a callback to execute when the read operation succeeds. In this example I have a GetUsers() function and a GetUsersCallback() function which handle the read part. Here is the code:

//Gets all the user accounts from service
function GetUsers()
{
    $("#loadingUsers").show();
    OData.read(USERS_ODATA_SVC, GetUsersCallback);
}

//GetUsers Success Callback
function GetUsersCallback(data, request)
{
    $("#loadingUsers").hide();
    $("#users").find("tr:gt(0)").remove();
    ApplyTemplate(data.results)
}

ApplyTemplate() is a helper function which takes the JSON data, uses jQuery templating to create the UI markup and appends it to the user account listing table.

Create Operation:

While the "read" API lets us get data, DataJS also has a "request" API which allows us to do the CUD operations – Create, Update and Delete. There are no specific Create/Update/Delete calls in OData; instead we use HTTP verbs to tell the server what operation to perform. Create uses the "POST" verb, Update uses the "PUT" verb and Delete uses the "DELETE" verb. As you have seen earlier, I have 3 fields on the User entity. While creating a new entity, I build a JSON representation of the data I need to send over to the server and pass it to the request API. The code for the create is as follows:

//Handle the DataJS call for new user account creation
function AddUser()
{
    $("#loading").show();
    var newUserdata = {
        username: $("#name").val(),
        email: $("#email").val(),
        password: $("#password").val() };
    var requestOptions = {
        requestUri: USERS_ODATA_SVC,
        method: "POST",
        data: newUserdata
    };

    OData.request(requestOptions,
        AddSuccessCallback,
        AddErrorCallback);
}

//AddUser Success Callback
function AddSuccessCallback(data, request)
{
    $("#loading").hide('slow');
    $("#dialog-form").dialog("close");
    GetUsers();
}

//AddUser Error Callback
function AddErrorCallback(error)
{
    alert("Error : " + error.message)
    $("#dialog-form").dialog("close");
}

As you can see, the newUserdata object is where I create the JSON representation of the data that needs to be passed – in this case username, email and password. Also notice the method set to "POST". The request API takes the request options, a success callback and an error callback. In the success callback I fetch all the users again and repaint the UI.

Update Operation:

If we want to update a resource exposed via OData we follow the same construct we used for Create; the only changes are the URL we point at and the HTTP verb. If I have a User whose UserID (in this example the PK) is 1, then the URL that addresses this user is http://<server>/UsersOData.svc/Users(1). Notice that we pass the user id in the URL itself; this is how you navigate to a resource in OData. To update this resource, we again prepare the data as JSON and call the request API, but with the method set to "PUT". PUT is the HTTP verb that lets the server know we want to update the resource with the new values passed. Here is the code block that handles the update:

//Handle DataJS calls to Update user data
function UpdateUser(userId)
{
    $("#loading").show();
    var updateUserdata = {
        username: $("#name").val(),
        email: $("#email").val(),
        password: $("#password").val() };
    var requestURI = USERS_ODATA_SVC + "(" + userId + ")";
    var requestOptions = {
        requestUri: requestURI,
        method: "PUT",
        data: updateUserdata
    };

    OData.request(requestOptions,
        UpdateSuccessCallback,
        UpdateErrorCallback);
}

//UpdateUser Success callback
function UpdateSuccessCallback(data, request) {
    $("#loading").hide('slow');
    $("#dialog-form").dialog("close");
    GetUsers();
}

//UpdateUser Error callback
function UpdateErrorCallback(error) {
    alert("Error : " + error.message)
    $("#dialog-form").dialog("close");
}

Delete Operation:

The delete operation is very similar to update. The only difference is that we don't pass any data; instead we set the URL to the resource that needs to be deleted and set the method to "DELETE", the HTTP verb the server looks for when a resource should be deleted. For example, if I have a user with id 1 and I want to delete that resource, we set the URL to http://<server>/UsersOData.svc/Users(1) and the method to "DELETE". Here is the code for the same:

//Handles DataJS calls for delete user
function DeleteUser(userId)
{
    var requestURI = USERS_ODATA_SVC + "(" + userId + ")";
    var requestOptions = {
        requestUri: requestURI,
        method: "DELETE"
    };

    OData.request(requestOptions,
        DeleteSuccessCallback,
        DeleteErrorCallback);
}

//DeleteUser Success callback
function DeleteSuccessCallback()
{
    $dialog.dialog('close');
    GetUsers();
}

//DeleteUser Error callback
function DeleteErrorCallback(error)
{
    alert(error.message)
}

That's it. With a single HTML file, the DataJS JavaScript library and a little bit of jQuery to spice up the UI, we built complete CRUD operations in just a couple of minutes.

DataJS is now getting traction and many people have started to use and follow this library. Hopefully this demo gives a pointer to all those who wanted to know more about the request API; the main goal was to focus on the request API of DataJS and how to use it. I hope I have done justice to that.

Find the source code attached here: DataJSCRUD.zip.

Till next time, as usual, Happy Coding. Code with passion, Decode with patience.


MSDN Courses reported the availability of an Accessing Cloud Data with Windows Azure Marketplace course on 6/1/2011:

Description

Working in the cloud is becoming a major initiative for application development departments. Windows Azure is Microsoft's cloud incorporating data and services. This lab will guide the reader through a series of exercises that creates a Silverlight Web Part that displays Windows Azure Marketplace data on a Silverlight Bing map control.

Overview

Working in the cloud is becoming a major initiative for application development departments. Windows Azure is Microsoft’s cloud incorporating data and services. This lab will guide the reader through a series of exercises that creates a Silverlight Web Part that displays Windows Azure Marketplace data on a Silverlight Bing map control.

Objectives

This lab will demonstrate how you can consume Windows Azure data using SharePoint 2010 and a Silverlight Web Part. To demonstrate connecting to Windows Azure data the reader will

  • Subscribe to a Microsoft Windows Azure Marketplace dataset
  • Create a Business Data Catalog model to access the dataset
  • Create an external list to expose the dataset
  • Create a Silverlight Web Part to display the data
  • Deploy the Web Part.
System Requirements

You must have the following items to complete this lab:

  • 2010 Information Worker Demonstration and Evaluation Virtual Machine
  • Microsoft Visual Studio 2010
  • Bing Silverlight Map control and Bing Map Developer Id
  • Silverlight WebPart
  • Internet Access
Setup

You must perform the following steps to prepare your computer for this lab...

  1. Download the 2010 Information Worker Demonstration and Evaluation Virtual Machine from http://tinyurl.com/2avoc4b and create the Hyper-V image.
  2. Install the Visual Studio 2010 Silverlight Web Part. The Silverlight Web Part is an add-on to Visual Studio 2010.
  3. Create a document library named SilverlightXaps located at http://intranet.contoso.com/silverlightxaps. This is where you will store the Silverlight Xap in SharePoint.
  4. Create a Bing Map Developer Account.
  5. Download and install the Bing Map Silverlight Control.
Exercises

This Hands-On Lab comprises the following exercises:

  1. Subscribe to a Microsoft Windows Azure MarketPlace Dataset
  2. Create a Business Data Catalog Model to Access the Dataset
  3. Create an External List to Consume the MarketPlace Dataset
  4. Create a Web Part to Display the MarketPlace Data
  5. Deploy the Web Part

Estimated time to complete this lab: 60 minutes.

Starting Materials

This Hands-On Lab includes the following starting materials.

  • Visual Studio solutions. The lab provides the following Visual Studio solutions that you can use as starting point for the exercises.
    • <Install>\Labs\ACDM\Source\Begin\TheftStatisticsBDCModel \TheftStatisticsBDCModel.sln: This solution creates the Business Data Catalog application.
    • <Install>\Labs\ACDM\Source\Begin\TheftStatisticsWebPart \TheftStatisticsWebPart.sln: This solution creates the Silverlight Web Part and Silvelright application that will consume data using an external list.

    Note:

    Inside each lab folder, you will find an end folder containing a solution with the completed lab exercise.

Read more: next >

Elisa Flasko (@eflasko) posted Announcing More New DataMarket Content! to the Windows Azure Marketplace DataMarket Blog on 6/3/2011:

Have you ever wondered if there are any environmental hazards around your house? If so, we’ve got the solution! Environmental Data Resources, Inc. is publishing their Environmental Hazard Rank offering. The EDR Environmental Hazard Ranking System depicts the relative environmental health of any U.S. ZIP code based on an advanced analysis of its environmental issues. The results are then aggregated by ZIP code to provide you with a rank so you can see how the ZIP code you're interested in stacks up.

It’s Trivia Time! Today’s question: Is the buying power of the US dollar greater today than it was in 1976? And today’s Bonus Question: was the buying power of the US dollar greater in 1976 than it was in 1923? If you’re sitting there scratching your head thinking “Hmm, these are good questions, Mr. Azure DataMarket Blog Writer Man”, then fear not, MetricMash has got you covered. They’re publishing their U.S. Consumer Price Index - 1913 to Current offering.

This offering provides the changes in the prices paid by consumers for over 375 goods and services in major expenditure groups – such as food, housing, apparel, transportation, medical care and education cost. The CPI can be used to measure inflation and adjust the real value of wages, salaries and pensions for regulating prices and for calculating the real rate of return on investments. And, speaking of buying power, MetricMash is offering a free trial on this offering to make it easy for developers to use this information inside their apps.

Elisa’s MetricMash description sounds more to me like a Yelp review.


George Trifonov (@GeorgeTrifonov) posted OData WCF Data Services Friendly URLs using Routing:

When you create your first OData WCF Data Service, a common task is to give it a friendly URL instead of using the filename.svc entry point. You can achieve this with the URL routing feature in ASP.NET. Just modify your Global.asax.cs route registration block to include the following lines:

public static void RegisterRoutes(RouteCollection routes)
{
    routes.Clear();
    // DataServiceHostFactory is in System.Data.Services;
    // ServiceRoute is in System.ServiceModel.Activation.
    var factory = new DataServiceHostFactory();
    RouteTable.Routes.Add(
        new ServiceRoute("API", factory, typeof(TestOdataService)));
}

It tells the system to associate the data service host factory with the given URL prefix.
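For completeness, the route registration would typically be invoked from Application_Start in the same Global.asax.cs (standard ASP.NET wiring, not shown in George's snippet):

// Assumed wiring: call the RegisterRoutes method above at application startup.
protected void Application_Start(object sender, EventArgs e)
{
    RegisterRoutes(RouteTable.Routes);
}

After that, the service responds at http://<host>/API/ instead of the .svc address.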


Doug Finke (@dfinke) asserted One clear advantage of OData is its commonality and posted a link to his OData PowerShell Explorer in a 6/2/2011 article:

If you care at all about how data is accessed – and you should - understanding the basics of OData is probably worth your time.

Dave Chappell’s post, Introducing OData, does a great job walking through what OData is. Covering REST, Atom/AtomPub, JSON, OData filter query syntax, mobile devices, the Cloud and more.

Our world is awash in data. Vast amounts exist today, and more is created every year. Yet data has value only if it can be used, and it can be used only if it can be accessed by applications and the people who use them.

Allowing this kind of broad access to data is the goal of the Open Data Protocol, commonly called just OData. This paper provides an introduction to OData, describing what it is and how it can be applied.

OData PowerShell Explorer

I open sourced an OData PowerShell Explorer complete with a WPF GUI using WPK. The PowerShell module allows you to discover and drill down through OData services using either the command line or the GUI interface.

Other Resources

Doug is a Microsoft Most Valuable Professional (MVP) for PowerShell.


• Alex James (@adjames) answered How do I do design? to enable queries against properties of related OData collections in a 6/3/2011 post:

image For a while now I've been thinking that the best way to get better at API & protocol design is to try to articulate how you do design.

Articulating your thought process has a number of significant benefits:

  • Once you know your approach you can critique and try to improve it.
  • Once you know what you'd like to do, in a collaborative setting you can ask others for permission to give it a go. In my experience phrases like "I'd like to explore how this is similar to LINQ ..." work well if you want others to brainstorm with you.
  • Once you understand what you do perhaps you can teach or mentor.
  • Or perhaps others can teach you.

Clearly these are compelling.

imageSo I thought I'd give this a go by trying to document an idealized version of the thought process for Any/All in OData...

The Problem

OData had no way to allow people to query entities based on properties of a related collection, and a lot of people, myself included, wanted to change this...

Step 1: Look for inspiration

I always look for things that I can borrow ideas from. Pattern matching, if you will, with something proven. Here you are limited only by your experience and imagination, there is an almost limitless supply of inspiration. The broader your experience the more source material you have and the higher the probability you will be able to see a useful pattern.

If you need it – and you shouldn’t - this is just another reason to keep learning new languages, paradigms and frameworks!

Sometimes there is something obvious you can pattern match with, other times you really have to stretch...

In this case the problem is really easy. Our source material is LINQ. LINQ already allows you to write queries like this:

from movie in Movies
where movie.Actors.Any(a => a.Name == 'Zack')
select movie;

Which filters movies by the actors (a related collection) in the movie, and that is exactly what we need in OData.

Step 2: Translate from source material to problem domain

My next step is generally to take the ideas from the source material and try to translate them to my problem space. As you do this you'll start to notice the differences between source and problem, some insignificant, some troublesome.

This is where judgment comes in: you need to know when to go back to the drawing board. For me, I know I'm on thin ice if I start saying things like 'well if I ignore X and Y, and imagine that Z is like this'.

But don't give up too early either. Often the biggest wins are gained by comparing things that seem quite different until you look a little deeper and find a way to ignore the differences that don't really matter. For example, Relational Databases and the web seem very different until you focus on ForeignKeys vs Hyperlinks; pull vs push seems different until you read about IQueryable vs IObservable.

Notice not giving up here is all about having tolerance for ambiguity, the higher your tolerance, the further you can stretch an idea. Which is vital if you want to be really creative.

In the case of Any/All it turns out that the inspiration and problem domain are very similar, and the differences are insignificant.

So how do you do the translation?

In OData predicate filters are expressed via $filter, so we need to convert this LINQ predicate:

(movie) => movie.Actors.Any(actor => actor.Name == 'Zack')

into something we can put in an OData $filter.

Let's attack this systematically from left to right. In LINQ you need to name the lambda variable in the predicate, i.e. movie, but in OData there is no need to name the thing you are filtering, it is implicit, for example this:

from movie in Movies
where movie.Name == "Donnie Darko"
select movie

is expressed like this in OData:

~/Movies/?$filter=Name eq 'Donnie Darko'

Notice there is no variable name, we access the Name of the movie implicitly.

So we can skip the ‘movie’.

Next, in LINQ the Any method is a built-in extension method called directly off the collection using '.Any'. In OData '/' is used in place of '.', and built-in methods are all lowercase, so that points at something like this:

~/Movies/?$filter=Actors/any(????)

As mentioned previously in OData we don't name variables, everything is implicit. Which means we can ignore the actor variable. That leaves only the filter, which we can convert using existing LINQ to OData conversion rules, to yield something like this:

~/Movies/?$filter=Actors/any(Name eq 'Zack')

Step 3: Did you lose something important?

There is a good chance you lost something important in this transformation, so my next step is generally to assess what information has been lost in translation. Paying particular attention to things that are important enough that your source material had specific constructs to capture them.

As you notice these differences you need to either convince yourself they don’t matter, or you need to add something new to your solution to bring it back.

In our case you'll remember that in LINQ the actor being tested in the Any method had a name (i.e. 'actor'), yet in our current OData design it doesn't.

Is this important?

Yes it is! Unlike the root filter, where there is only one variable in scope (i.e. the movie), inside an Any/All there are potentially two variables in scope (i.e. the movie and the actor). And if neither are named we won't be able to distinguish between them!

For example this query, which finds any movies with the same name as any actors who star in the movie, is unambiguous in LINQ:

from movie in Movies
where movie.Actors.Any(actor => actor.Name == movie.Name)
select movie;

But our proposed equivalent is clearly nonsensical:

~/Movies/?$filter=Actors/any(Name eq Name)

It seems we need a way to refer to both the inner (actor) and outer variables (movie) explicitly.

Now we can't change the way existing OData queries work - without breaking clients and servers already deployed - which means we can't explicitly name the outer variable; we can, however, introduce a way to refer to it implicitly. This should be a reserved name so it can't collide with any properties. OData already uses the $ prefix for reserved names (i.e. $filter, $skip etc.) so we could try something like $it. This results in something like this:

~/Movies/?$filter=Actors/any(Name eq $it/Name)

And now the query is unambiguous again.

But unfortunately we aren't done yet. We need to make sure nested calls work too, for example this:

from movie in Movies
where movie.Actors.Any(actor => actor.Name == movie.Name && actor.Awards.Any(award => award.Name == 'Oscar'))
select movie;

If we translate this, with the current design we get this:

~/Movies/?$filter=Actors/any(Name eq $it/Name AND Awards/any(Name eq 'Oscar'))

But now it is unclear whether Name eq 'Oscar' refers to the actor or the award. Perhaps we need to be able to name the actor and award variables too. Here we are not restricted by the past: Any/All is new, so it can include a way to explicitly name the variable. Again we look at LINQ for inspiration:

award => award.Name == 'Oscar'

Clearly we need something like '=>' that is URI friendly and compatible with the current OData design. It turns out ':' is a good candidate, because it works well in querystrings, isn’t currently used by OData, and even better there is a precedent in Python lambdas (notice the pattern matching again). So the final proposal is something like this:

~/Movies/?$filter=Actors/any(actor: actor/Name eq $it/Name)

Or for the nested scenario:

~/Movies/?$filter=Actors/any(actor: actor/Name eq $it/Name AND actor/Awards/any(award: award/Name eq 'Oscar'))

And again nothing is ambiguous.
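
As an aside (this isn’t in Alex’s post): the any/all syntax was only a proposal at the time, so no shipping server understood it yet, but a client wouldn’t need anything special to use it, since the filter is just part of the query string. A minimal sketch, assuming a hypothetical service root:

using System;
using System.Net;

class AnyAllQuerySample
{
    static void Main()
    {
        // Hypothetical OData service root; substitute a real endpoint that
        // supports the proposed any/all syntax.
        const string serviceRoot = "http://example.org/odata/";

        // The nested filter from the post: movies with an actor whose name
        // matches the movie's name and who has won an Oscar.
        string filter =
            "Actors/any(actor: actor/Name eq $it/Name AND " +
            "actor/Awards/any(award: award/Name eq 'Oscar'))";

        string url = serviceRoot + "Movies/?$filter=" + Uri.EscapeDataString(filter);

        using (var client = new WebClient())
        {
            // The response is an Atom feed of the matching Movie entries.
            Console.WriteLine(client.DownloadString(url));
        }
    }
}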

Step 4: Rinse and Repeat

In this case the design feels good, so we are done.

Clearly that won't always be the case, but you should at least know whether the design looks promising or not. If not, it's back to the drawing board.

If it does look promising, I would essentially repeat steps 1-3 again using some other inspiration to inform further tweaking of the design, hopefully to something that feels complete.

Conclusion

While writing this up I definitely teased out a number of things that had previously been unconscious, and I'm sure this will help me going forward; hopefully it will help you too.

What do you think about all this? Do you have any suggested improvements?

Next time I do this I'll explore something that involves more creativity... and I'll try to tease out more of the unconscious tools I use.


Bob Brauer (@sibob) will present an MSDN Webcast: Integrating High-Quality Data with Windows Azure DataMarket (Level 200) on 6/21/2011 at 8:00 AM PDT:

Event ID: 1032487799

  • Language(s): English.
  • Product(s): Windows Azure.
  • Audience(s): Pro Dev/Programmer.

imageAll of us are aware of the importance of high-quality data, especially when organizations communicate with customers, create business processes, and use data for business intelligence. In this webcast, we demonstrate the easy-to-integrate data services that are available within Windows Azure DataMarket from StrikeIron. We look at the service itself and how it works. You'll learn how to integrate it and view several different use cases to see how customers are benefitting from this data quality service that is now available in Windows Azure DataMarket.

image Presenter: Bob Brauer, Chief Strategy Officer and Cofounder, StrikeIron, Inc.

Bob Brauer is an expert in the field of data quality, including creating better, more usable data via the cloud. He first founded DataFlux, one of the industry leaders in data quality technology, which has been acquired by SAS Institute. Then he founded StrikeIron in 2003, leveraging the Internet to make data quality technology accessible to more organizations. Visit www.strikeiron.com for more information.


Steve Milroy reported DataMarket Mapper app provides quick, map-based visualization of DataMarket data to the Bing Maps blog on 6/2/2011:

Update: at 11:15am PT, we changed the hyperlink to the DataMarket Application so that our international viewers could access the app. Sorry for any inconvenience.

imageThe Windows Azure DataMarket is a growing repository of data sources and services mediated by Microsoft. It allows customers to purchase vector and tiled datasets using Open Data Protocol (OData) specifications. Available datasets include weather, crime, demographics, parcels, plus many other layers. The OData working group has been looking at full spatial support for this specification; in the meantime, DataMarket datasets can still be used with geospatial applications and Bing Maps.

image To further assist use of datasets, we are happy to announce the launch of DataMarket Mapper, a map app that allows quick and easy map-based visualization of DataMarket data. The app was developed by OnTerra Systems, along with the Microsoft DataMarket group. Look for it in the Bing Maps App Gallery. With a DataMarket subscription you can access layers, and visualize these layers on Bing Maps. If lat/longs don’t exist in the data source then the Bing Maps geocoder will geocode on the fly. Even if you don’t have a DataMarket account, we’ve made some crime and demographic data available in the app.

Figure 1 – DataMarket Mapper showing crime statistics per city using Data.gov crime data.

DMBlog1

imageYou can also use DataMarket data in Bing Maps applications using Bing Maps AJAX and Silverlight APIs. This involves using the vector shape or tile layer methods. Figure 2 shows parcel tile layer overlay on Bing Maps AJAX 7.0 Control API. If you have a DataMarket key with access to these layers you can also access these demos directly for testing/code samples:
•    Alteryx Demographics
•    BSI Parcels
•    Weatherbug Station Locations

Figure 2 – DataMarket parcel data from B.S.I. shown in a web mapping application using the Bing Maps AJAX 7 API.

DMBlog2

We’re excited to launch this new app. After you visit the app and see it in action, we’d enjoy hearing your feedback. We’re already working on updates. In the near future, you’ll see the addition of country-based datasets, Windows Live ID/OAuth, and thematic mapping. If you have any questions or comments or would like help in building a custom Bing Maps application with DataMarket Mapper, please contact OnTerra Systems.com.

Steve is CEO of OnTerra Systems.


<Return to section navigation list> 

Windows Azure AppFabric: Access Control, WIF and Service Bus

Martin Ingvar Kofoed Jensen (@IngvarKofoed) described A basic inter webrole broadcast communication on Azure using the service bus in a 5/19/2011 article (missed when posted):

imageIn this blog post I'll try to show a bare-bones setup that does inter webrole broadcast communication. The code is based on Valery M's blog post. The code in his blog post is based on a customer solution and contains a lot more code than needed to get the setup working. But his code also provides much more robust broadcast communication, with retries and other things that make the communication reliable. I have omitted all this to make it as easy to understand and recreate as possible. The idea is that the code I provide could be used as a basis for setting up your own inter webrole broadcast communication. You can download the code here: InterWebroleBroadcastDemo.zip (17.02 kb)

Windows Azure AppFabric SDK and using Microsoft.ServiceBus.dll

We need a reference to Microsoft.ServiceBus.dll in order to do the inter webrole communication. The Microsoft.ServiceBus.dll assembly is a part of the Windows Azure AppFabric SDK found here.
When you use Microsoft.ServiceBus.dll you need to add it as a reference like any other assembly. You do this by browsing to the directory where the AppFabric SDK was installed. But unlike most other references you add, you need to set the "Copy local" property for the reference to true (default is false).
I have put all my code in a separate assembly, and the main classes are then used in the WebRole.cs file. Even though I have added Microsoft.ServiceBus.dll to my assembly and set "Copy Local" to true, I still have to add it to the WebRole project and also set "Copy Local" to true there. This is a very important detail!

Creating a new Service Bus Namespace

Here is a short step-guide on how to create a new Service Bus Namespace. If you have already done this, you can skip it and just use the already existing namespace and its values.

  1. Go to the section "Service Bus, Access Control & Caching"
  2. Click the button "New Namespace"
  3. Check "Service Bus"
  4. Enter the desired Namespace (This namespace is the one used for EndpointInformation.ServiceNamespace)
  5. Click "Create Namespace"
  6. Select the newly created namespace
  7. Under properties (To the right) find Default Key and click "View"
  8. Here you will find the Default Issuer (This value should be used for EndpointInformation.IssuerName) and Default Key (This value should be used for  EndpointInformation.IssuerSecret)
The code

Here I will go through all the classes in my sample code. The full project including the WebRole project can be download here: InterWebroleBroadcastDemo.zip (17.02 kb)
BroadcastEvent

We start with the BroadcastEvent class. This class represents the data we send across the wire. This is done with the class attribute DataContract and the member attribute DataMember. In this sample code I only send two simple strings. SenderInstanceId is not required but I use it to display where the message came from.

[DataContract(Namespace = BroadcastNamespaces.DataContract)]
public class BroadcastEvent
{
public BroadcastEvent(string senderInstanceId, string message)
{
this.SenderInstanceId = senderInstanceId;
this.Message = message;
}

[DataMember]
public string SenderInstanceId { get; private set; }

[DataMember]
public string Message { get; private set; }
}

BroadcastNamespaces

This class only contains some constants that are used by some of the other classes.

public static class BroadcastNamespaces
{
public const string DataContract = "http://broadcast.event.test/data";
public const string ServiceContract = "http://broadcast.event.test/service";
}

IBroadcastServiceContract

This interface defines the contract that the web roles use when communicating with each other. Here in this simple example, the contract only has one method, namely the Publish method. This method is, in the implementation of the contract (BroadcastService), used to send BroadcastEvents to all web roles that have subscribed to this channel. There is another method, Subscribe, that is inherited from IObservable<BroadcastEvent>. This method is used to subscribe to the BroadcastEvents when they are published by some web role. This method is also implemented in the BroadcastService class.

[ServiceContract(Name = "BroadcastServiceContract", 
Namespace = BroadcastNamespaces.ServiceContract)]
public interface IBroadcastServiceContract : IObservable<BroadcastEvent>
{
[OperationContract(IsOneWay = true)]
void Publish(BroadcastEvent e);
}

IBroadcastServiceChannel

This interface defines the channel which the web roles communicate through. This is done by adding the IClientChannel interface.

public interface IBroadcastServiceChannel : IBroadcastServiceContract, IClientChannel
{
}

BroadcastEventSubscriber

The web role subscribes to the channel by creating an instance of this class and registering it. For testing purposes, this implementation only logs when it receives a BroadcastEvent.

public class BroadcastEventSubscriber : IObserver<BroadcastEvent>
{
public void OnNext(BroadcastEvent value)
{
Logger.AddLogEntry(RoleEnvironment.CurrentRoleInstance.Id +
" got message from " + value.SenderInstanceId + " : " +
value.Message);
}

public void OnCompleted()
{
/* Handle on completed */
}

public void OnError(Exception error)
{
/* Handle on error */
}
}

BroadcastService

This class implements the IBroadcastServiceContract interface. It handles the publish scenario by calling the OnNext method on all subscribers in parallel. The reason for doing this in parallel is that the OnNext method is blocking, so there is a good chance of a performance gain.
The other method is Subscribe. This method adds the BroadcastEvent observer to the subscribers and returns an object of type UnsubscribeCallbackHandler that, when disposed, unsubscribes the observer. This is part of the IObserver/IObservable pattern.

[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, 
ConcurrencyMode = ConcurrencyMode.Multiple)]
public class BroadcastService : IBroadcastServiceContract
{
private readonly IList<IObserver<BroadcastEvent>> _subscribers =
new List<IObserver<BroadcastEvent>>();

public void Publish(BroadcastEvent e)
{
ParallelQuery<IObserver<BroadcastEvent>> subscribers =
from sub in _subscribers.AsParallel().AsUnordered()
select sub;

subscribers.ForAll((subscriber) =>
{
try
{
subscriber.OnNext(e);
}
catch (Exception ex)
{
try
{
subscriber.OnError(ex);
}
catch (Exception)
{
/* Ignore exception */
}
}
});
}

public IDisposable Subscribe(IObserver<BroadcastEvent> subscriber)
{
if (!_subscribers.Contains(subscriber))
{
_subscribers.Add(subscriber);
}

return new UnsubscribeCallbackHandler(_subscribers, subscriber);
}


private class UnsubscribeCallbackHandler : IDisposable
{
private readonly IList<IObserver<BroadcastEvent>> _subscribers;
private readonly IObserver<BroadcastEvent> _subscriber;

public UnsubscribeCallbackHandler(IList<IObserver<BroadcastEvent>> subscribers,
IObserver<BroadcastEvent> subscriber)
{
_subscribers = subscribers;
_subscriber = subscriber;
}

public void Dispose()
{
if ((_subscribers != null) && (_subscriber != null) &&
(_subscribers.Contains(_subscriber)))
{
_subscribers.Remove(_subscriber);
}
}
}
}

ServiceBusClient

The main purpose of the ServiceBusClient class is to set up and create a ChannelFactory<IBroadcastServiceChannel> and an IBroadcastServiceChannel instance through the factory. The channel is used by the web role to send BroadcastEvents through the Publish method. It is in this class that all the Azure Service Bus magic happens: setting up the binding and the endpoint. A few Service Bus related constants are used here; they are all kept in the EndpointInformation class.

public class ServiceBusClient<T> where T : class, IClientChannel, IDisposable
{
private readonly ChannelFactory<T> _channelFactory;
private readonly T _channel;
private bool _disposed = false;

public ServiceBusClient()
{
Uri address = ServiceBusEnvironment.CreateServiceUri("sb",
EndpointInformation.ServiceNamespace, EndpointInformation.ServicePath);

NetEventRelayBinding binding = new NetEventRelayBinding(
EndToEndSecurityMode.None,
RelayEventSubscriberAuthenticationType.None);

TransportClientEndpointBehavior credentialsBehaviour =
new TransportClientEndpointBehavior();
credentialsBehaviour.CredentialType =
TransportClientCredentialType.SharedSecret;
credentialsBehaviour.Credentials.SharedSecret.IssuerName =
EndpointInformation.IssuerName;
credentialsBehaviour.Credentials.SharedSecret.IssuerSecret =
EndpointInformation.IssuerSecret;

ServiceEndpoint endpoint = new ServiceEndpoint(
ContractDescription.GetContract(typeof(T)), binding,
new EndpointAddress(address));
endpoint.Behaviors.Add(credentialsBehaviour);

_channelFactory = new ChannelFactory<T>(endpoint);

_channel = _channelFactory.CreateChannel();
}

public T Client
{
get
{
if (_channel.State == CommunicationState.Opening) return null;

if (_channel.State != CommunicationState.Opened)
{
_channel.Open();
}

return _channel;
}
}

public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}

public void Dispose(bool disposing)
{
if (!_disposed)
{
if (disposing)
{
try
{
if (_channel.State == CommunicationState.Opened)
{
_channel.Close();
}
else
{
_channel.Abort();
}
}
catch (Exception)
{
/* Ignore exceptions */
}


try
{
if (_channelFactory.State == CommunicationState.Opened)
{
_channelFactory.Close();
}
else
{
_channelFactory.Abort();
}
}
catch (Exception)
{
/* Ignore exceptions */
}

_disposed = true;
}
}
}

~ServiceBusClient()
{
Dispose(false);
}
}

ServiceBusHost

The main purpose of the ServiceBusHost class is to set up, create and open a ServiceHost. The service host is used by the web role to receive BroadcastEvents by registering a BroadcastEventSubscriber instance. Like ServiceBusClient, it is in this class that all the Azure Service Bus magic happens.

public class ServiceBusHost<T> where T : class
{
private readonly ServiceHost _serviceHost;
private bool _disposed = false;

public ServiceBusHost()
{
Uri address = ServiceBusEnvironment.CreateServiceUri("sb",
EndpointInformation.ServiceNamespace, EndpointInformation.ServicePath);

NetEventRelayBinding binding = new NetEventRelayBinding(
EndToEndSecurityMode.None,
RelayEventSubscriberAuthenticationType.None);

TransportClientEndpointBehavior credentialsBehaviour =
new TransportClientEndpointBehavior();
credentialsBehaviour.CredentialType =
TransportClientCredentialType.SharedSecret;
credentialsBehaviour.Credentials.SharedSecret.IssuerName =
EndpointInformation.IssuerName;
credentialsBehaviour.Credentials.SharedSecret.IssuerSecret =
EndpointInformation.IssuerSecret;

ServiceEndpoint endpoint = new ServiceEndpoint(
ContractDescription.GetContract(typeof(T)), binding,
new EndpointAddress(address));
endpoint.Behaviors.Add(credentialsBehaviour);

_serviceHost = new ServiceHost(Activator.CreateInstance(typeof(T)));

_serviceHost.Description.Endpoints.Add(endpoint);

_serviceHost.Open();
}

public T ServiceInstance
{
get
{
return _serviceHost.SingletonInstance as T;
}
}

public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}

public void Dispose(bool disposing)
{
if (!_disposed)
{
if (disposing)
{
try
{
if (_serviceHost.State == CommunicationState.Opened)
{
_serviceHost.Close();
}
else
{
_serviceHost.Abort();
}
}
catch
{
/* Ignore exceptions */
}
finally
{
_disposed = true;
}
}
}
}

~ServiceBusHost()
{
Dispose(false);
}
}

EndpointInformation

This class keeps all the Service Bus related constants. I have put dummy constants for ServiceNamespace, IssuerName and IssuerSecret. These you have to find in the Windows Azure Management Portal [URL]. See the "Creating a new Service Bus Namespace" section above for how to create a new namespace and obtain these values.

public static class EndpointInformation
{
public const string ServiceNamespace = "CHANGE THIS TO YOUR NAMESPACE";
public const string ServicePath = "BroadcastService";
public const string IssuerName = "CHANGE THIS TO YOUR ISSUER NAME";
public const string IssuerSecret = "CHANGE THIS TO YOUR ISSUER SECRET";
}

BroadcastCommunicator

This class abstracts all the dirty details away and is the main class that the web role uses. It has two methods: Publish, for publishing BroadcastEvent instances, and Subscribe, for subscribing to the broadcast events by creating an instance of BroadcastEventSubscriber and handing it to the Subscribe method.

public class BroadcastCommunicator : IDisposable
{
private ServiceBusClient<IBroadcastServiceChannel> _publisher;
private ServiceBusHost<BroadcastService> _subscriber;
private bool _disposed = false;

public void Publish(BroadcastEvent e)
{
if (this.Publisher.Client != null)
{
this.Publisher.Client.Publish(e);
}
}

public IDisposable Subscribe(IObserver<BroadcastEvent> subscriber)
{
return this.Subscriber.ServiceInstance.Subscribe(subscriber);
}

private ServiceBusClient<IBroadcastServiceChannel> Publisher
{
get
{
if (_publisher == null)
{
_publisher = new ServiceBusClient<IBroadcastServiceChannel>();
}

return _publisher;
}
}

private ServiceBusHost<BroadcastService> Subscriber
{
get
{
if (_subscriber == null)
{
_subscriber = new ServiceBusHost<BroadcastService>();
}

return _subscriber;
}
}

public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}

public void Dispose(bool disposing)
{
if (!_disposed && disposing)
{
try
{
_subscriber.Dispose();
_subscriber = null;
}
catch
{
/* Ignore exceptions */
}

try
{
_publisher.Dispose();
_publisher = null;
}
catch
{
/* Ignore exceptions */
}

_disposed = true;
}
}

~BroadcastCommunicator()
{
Dispose(false);
}
}

WebRole

This is a pretty straightforward web role. In the OnStart method an instance of BroadcastCommunicator is created and an instance of BroadcastEventSubscriber is used to subscribe to the channel.
The Run method is an endless loop with a random sleep in every iteration, for testing purposes. In every iteration it sends a "Hello world!" message including its own role instance id.
The OnStop method cleans up by disposing the disposable objects.

public class WebRole : RoleEntryPoint
{
private volatile BroadcastCommunicator _broadcastCommunicator;
private volatile BroadcastEventSubscriber _broadcastEventSubscriber;
private volatile IDisposable _broadcastSubscription;
private volatile bool _keepLooping = true;


public override bool OnStart()
{
_broadcastCommunicator = new BroadcastCommunicator();
_broadcastEventSubscriber = new BroadcastEventSubscriber();

_broadcastSubscription =
_broadcastCommunicator.Subscribe(_broadcastEventSubscriber);

return base.OnStart();
}



public override void Run()
{
/* Just keep sending messages */
while (_keepLooping)
{
int secs = ((new Random()).Next(30) + 60);

Thread.Sleep(secs * 1000);
try
{
BroadcastEvent broadcastEvent =
new BroadcastEvent(RoleEnvironment.CurrentRoleInstance.Id,
"Hello world!");

_broadcastCommunicator.Publish(broadcastEvent);
}
catch (Exception ex)
{
Logger.AddLogEntry(ex);
}
}
}

public override void OnStop()
{
_keepLooping = false;

if (_broadcastCommunicator != null)
{
_broadcastCommunicator.Dispose();
}

if (_broadcastSubscription != null)
{
_broadcastSubscription.Dispose();
}

base.OnStop();
}
}

Logger

The Logger class is used in several places in the code. If a logger action has been set, logging will be done. Read more about how I did logging below.

public static class Logger
{
private static Action<string> AddLogEntryAction { get; set; }

public static void Initialize(Action<string> addLogEntry)
{
AddLogEntryAction = addLogEntry;
}

public static void AddLogEntry(string entry)
{
if (AddLogEntryAction != null)
{
AddLogEntryAction(entry);
}
}

public static void AddLogEntry(Exception ex)
{
while (ex != null)
{
AddLogEntry(ex.ToString());

ex = ex.InnerException;
}
}
}
Simple but effective logging

When I developed this demo I used a web service on another server for logging. This web service just has one method taking one string argument, the line to log. Then I have a page for displaying and clearing the log. This is a very simple way of doing logging, but it gets the job done.
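
The post doesn’t include the wiring for that web service, but the Logger.Initialize hook above suggests something along these lines. A minimal sketch, assuming a hypothetical log service URL that accepts a plain string via HTTP POST:

// Somewhere early in WebRole.OnStart(), before anything tries to log.
Logger.Initialize(entry =>
{
    try
    {
        using (var client = new System.Net.WebClient())
        {
            // Timestamp plus the entry text keeps the combined log readable.
            string line = string.Format("{0:HH:mm:ss.ffff} : {1}",
                DateTime.UtcNow, entry);
            client.UploadString("http://mylogserver.example.com/AddLogEntry", line);
        }
    }
    catch
    {
        /* Never let a logging failure take the role down */
    }
});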

The output

Below is the output from a run of the demo project with 4 web role instances. The first two lines are the most interesting: you can see that only web role instances WebRole_IN_0 and WebRole_IN_2 are ready to receive (and send) events. The reason for this is the late creation (create when needed) of the ServiceBusClient and ServiceBusHost in the BroadcastCommunicator class and the sleep period in the WebRole. This illustrates that web roles can join the broadcast channel at any time and start sending and receiving events.
20:30:40.4976 : WebRole_IN_2 got message from WebRole_IN_0 : Hello world!
20:30:40.7576 : WebRole_IN_0 got message from WebRole_IN_0 : Hello world!
20:30:43.0912 : WebRole_IN_2 got message from WebRole_IN_3 : Hello world!
20:30:43.0912 : WebRole_IN_1 got message from WebRole_IN_3 : Hello world!
20:30:43.0912 : WebRole_IN_0 got message from WebRole_IN_3 : Hello world!
20:30:43.1068 : WebRole_IN_3 got message from WebRole_IN_3 : Hello world!
20:30:45.4505 : WebRole_IN_0 got message from WebRole_IN_2 : Hello world!
20:30:45.4505 : WebRole_IN_3 got message from WebRole_IN_2 : Hello world!
20:30:45.4505 : WebRole_IN_1 got message from WebRole_IN_2 : Hello world!
20:30:45.4662 : WebRole_IN_2 got message from WebRole_IN_2 : Hello world!
20:30:59.4816 : WebRole_IN_0 got message from WebRole_IN_1 : Hello world!
20:30:59.4816 : WebRole_IN_3 got message from WebRole_IN_1 : Hello world!
20:30:59.4972 : WebRole_IN_2 got message from WebRole_IN_1 : Hello world!
20:30:59.4972 : WebRole_IN_1 got message from WebRole_IN_1 : Hello world!
20:31:59.1371 : WebRole_IN_2 got message from WebRole_IN_3 : Hello world!
20:31:59.2621 : WebRole_IN_1 got message from WebRole_IN_3 : Hello world!
20:31:59.3871 : WebRole_IN_0 got message from WebRole_IN_3 : Hello world!
20:31:59.5746 : WebRole_IN_3 got message from WebRole_IN_3 : Hello world!
20:32:03.1683 : WebRole_IN_2 got message from WebRole_IN_0 : Hello world!
20:32:03.1683 : WebRole_IN_0 got message from WebRole_IN_0 : Hello world!
20:32:03.1683 : WebRole_IN_1 got message from WebRole_IN_0 : Hello world!
20:32:03.1839 : WebRole_IN_3 got message from WebRole_IN_0 : Hello world!


•• Alik Levin (@alikl) described Windows Azure AppFabric Access Control Service (ACS): REST Web Services And OAuth 2.0 Delegation in a 6/3/2011 post:

imageScenario

Following are characteristics of the scenario:

  • RESTful web service requires SWT token.
  • Credentials validated by the same authority that exposes the RESTful web service.
  • RESTful web service is accessed by intermediary and not by the end user.
  • Credentials must not be shared with intermediary.

Scenario: ACS OAuth Delegation REST

Solution
  • Use ACS as an OAuth authorization server.
  • Use WIF Extensions for OAuth CTP.

Solution: ACS OAuth Delegation REST

Supporting Materials
Related Resources


•• Azret Botash (@Ba3e64) offered a Sneak Peek! OAuth Library – (coming soon in v2011, volume 1) on 6/3/2011:

In v2011 vol 1, we are going to make the OAuth library that we’ve developed for internal use available to all our customers. The DevExpress OAuth library will support the consumer side for OAuth 1.0 and OAuth 2.0 Draft as well as the provider side for OAuth 1.0.

The following example How to login to Twitter, Google and Facebook using OAuth demonstrates how to login to some of the most popular services that support OAuth authentication.

How to login to Twitter, Google and Facebook using OAuth


• Scott Densmore (@scottdensmore) announced A Guide to Claims Based Identity - RC is Out on 6/3/2011:

image We just released our Release Candidate for the book that should be out in the next month. This is the second version of the original that includes new content on Single Sign Out, Windows Azure Access Control Service and SharePoint. If you are interested in Identity (and you should be), you need to get this guide. We would love any feedback before this is out. Download and become a claims / identity expert.


Clemens Vasters (@clemensv) and Sunghwa Jin (@talksunghwa) presented MID312: Windows Azure AppFabric Service Bus: New Capabilities at TechEd North America 2011 on 5/17/2011 at 3:15 PM:

image

imageAt PDC 2010, the Windows Azure AppFabric team provided customers with the first glimpse into the future of Service Bus. Since then, we've wrapped up another milestone on our way and will soon be releasing a major milestone by Tech·Ed. In this session we provide insight into how Service Bus will evolve, what new capabilities we're adding and what changes we're making, and what role Service Bus will play in the greater AppFabric services story.

Watch video here; download slides here.

This is a GREAT presentation! Don’t miss it.


<Return to section navigation list> 

Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

The Windows Azure Connect Team explained Speeding Up SQL Server Connections in a 6/3/2011 post:

We’ve heard from some customers that initial connections to on-premise SQL servers using Windows Azure Connect sometimes take a long time if the Azure machines are domain-joined. On investigating the issue, we’ve found that all current versions of the SQL client attempt to connect via IPv4 before IPv6 regardless of system settings (more details here). Normally, when you connect to a machine using Windows Azure Connect, the Connect endpoint looks up the name and returns an IPv6 address. However, when your Azure VM is domain-joined, it can look up the name in your on-premise DNS server as well, which returns an IPv4 address. When that happens, the SQL client chooses to use the IPv4 address first and needs to time out the IPv4 connection attempt before it can connect through IPv6.

We’ve identified a simple workaround to avoid the timeout and speed up connections: create a firewall rule on your Azure roles to block outbound connections to SQL over IPv4. That causes the incorrect IPv4 connection to fail immediately instead of timeout. The easiest way to accomplish that is to add a startup task to your role that runs a command like:

netsh advfirewall firewall add rule name="BlockIPv4SQL" dir=out action=block protocol=tcp remoteport=1433 remoteip=(your on-premise IPv4 range)

Note that if you use SQL Azure in addition to SQL over Windows Azure Connect, you will need to ensure that the remoteip range in the rule exempts traffic to your SQL Azure servers.

If you’re looking for other performance improvements, make sure you’re using a relay close to you.


• Peter Meister presented What Are the Bridges between Private and Public Cloud? to Tech Ed North America 2011 on 5/17/2011 at 1:30 PM:

image

In this session we look at the core Hybrid Cloud topologies capable with Windows Azure and Windows Server Private Cloud Architecture. We showcase bridging the two environments together and how you can leverage private and public cloud to scale your enterprise needs.

Watch the video here and download slides here.


Lori MacVittie (@lmacvittie) asked World IPv6 Day is June 8. We’re ready, how about you? as an introduction to her F5 Friday: Thanks for calling... please press 1 for IPv6 or 2 for IPv4 post of 6/3/2011 to F5’s DevCentral blog:

image World IPv6 day, scheduled for 8 June 2011, is a global-scale test flight of IPv6 sponsored by the Internet Society. On World IPv6 Day, major web companies and other industry players will come together to enable IPv6 on their main websites for 24 hours. The goal is to motivate organizations across the industry — Internet service providers, hardware makers, operating system vendors and web companies — to prepare their services for IPv6 to ensure a successful transition as IPv4 address space runs out.

This is more than a marketing play to promote IPv6 capabilities, it’s a serious test to ensure that services are prepared to meet the challenge of a dual-stack environment of the kind that will be necessary to support the migration from IPv4 to IPv6. Such a migration is not a trivial task as it requires more than simply flipping a switch in the billions of components, applications and services that make up what we call “The Internet”. That’s because IPv6 shares basic concepts like routing, switching and internetworking communication with IPv4, but the technical bits that describe hosts, services and endpoints on the Internet and in the data center are different enough to make cross-protocol communication challenging.

Supporting IPv6 is easy; supporting communication between IPv6 and IPv4 during such a massive migration is not.

If you consider how tightly coupled not only routing and switching but also applications and myriad security, acceleration, access and application-centric networking policies are to IP, you start to see how large a task such a migration really will be. Cloud computing hasn’t helped there by relying on the IP address as the primary mechanism for identifying instances of applications as they are provisioned and decommissioned throughout the day. All that eventually needs to change, to be replaced with IPv6-compatible systems, components and management frameworks, and it’s not going to happen in a single day.

FIRST THINGS FIRST

The first step is simply to lay the foundation for services and core Internet communications to support IPv6, and that’s what World IPv6 Day is promoting – an IPv6 Internet with IPv6 capable services on the outside interacting with other IPv6 capable services and networking components and clients. In many ways, World IPv6 Day will illustrate the power of loose coupling, of service-oriented networking and architectures.
Most organizations aren’t ready for the gargantuan task of migrating their data centers to IPv6, nor for the investment that may be required in upgrading or replacing core infrastructure to support the new standard. The beauty of loose coupling and translative gateways, however, is that they don’t have to – yet.

As part of our participation in World IPv6 Day, F5’s IT team worked hard - and ate a whole lot of our own dog food - to ensure that users have a positive experience while browsing our sites from an IPv6 device. That means you don’t have to press “1” for IPv6 or “2” for IPv4 as you do when communicating with organizations that supporting multiple languages.

Like our own customers, we have an organizational reliance on IP addresses in the network and application infrastructure that thoroughly permeates throughout configurations and even application logic. But leveraging our own BIG-IP Local Traffic Manager (LTM) with IPv6 Gateway Module means we don’t have to perform a mass IPectomy on our entire internal infrastructure. Using the IPv6 gateway we’re able to maintain our existing infrastructure – all talking IPv4 – while providing IPv6 interfaces to Internet-facing infrastructure and clients. Both our corporate site (www.f5.com) and our community site (devcentral.f5.com) have been “migrated” to IPv6 and stand ready to speak what will one day be the lingua franca of the Internet.

Granted, we had some practice at Interop 2011 supporting the Interop NOC IPv6 environment. F5 provided network and DNS translations and facilitated access and functionality for both IPv4 and IPv6 clients to resources on the Interop network. F5 also provided an IPv6 gateway to the www.interop.com website.

Because organizations can continue to leverage IPv4 internally with an IPv6 gateway – and thus make no changes to their internal architecture, infrastructure, and applications – there is less pressure to migrate immediately, which can reduce the potential for introducing problems that cause downtime and outages. As Mike Fratto mentioned when describing Network Computing’s IPv6 enablement using BIG-IP:

Like many other organizations, we have to migrate to IPv6 at some point, and this is the first step in the process--getting our public-facing servers ready. There is no rush to roll out massive changes, and by taking the transition in smaller bits, you will be able to manage the transition seamlessly.

A planned, conscious effort to move to IPv6 internally in stages will reduce the overall headaches and inevitable outages caused by issues sure to arise during the process.

F5 and IPv6

F5 BIG-IP supports IPv6 but more importantly its IPv6 Gateway Module supports efforts to present an IPv6 interface to the public-facing world while maintaining existing IPv4 based infrastructure.

Deploying a gateway can provide the translation necessary to enable the entire organization to communicate with IPv6 regardless of IP version utilized internally. A gateway translates between IP versions rather than leveraging tunneling or other techniques that can cause confusion to IP-version specific infrastructure and applications. Thus if an IPv6 client communicates with the gateway and the internal network is still completely IPv4, the gateway performs a full translation of the requests bi-directionally to ensure seamless interoperation. This allows organizations to continue utilizing their existing investments – including network management software and packaged applications that may be under the control of a third party and are not IPv6 aware yet – but publicly move to supporting IPv6.

Additionally, F5 BIG-IP Global Traffic Manager (GTM) handles IPv6 integration natively when answering AAAA (IPv6) DNS requests and includes a checkbox feature to reject IPv6 queries for Wide IPs that only have IPv4 addresses, which forces the client DNS resolver to re-request asking for the IPv4 address. This solves a common problem with deployment of dual stack IPv6 and IPv4 addressing. The operating systems try to query for an IPv6 address first and will hang or delay if they don’t get a rejection. GTM solves this problem by supporting both address schemes simultaneously.

Learn More

For Enterprises

For Service Providers

If you’ve got an IPv6-enabled device, give the participating sites on June 8 a try. While we’ll all learn a lot about IPv6 and any potential pitfalls with a rollout throughout the day just by virtue of the networking that’s always going on under the hood, without client participation it’s hard to gauge whether there’s more work to be done on that front. Even if your client isn’t IPv6 enabled, give these sites a try – they should be supporting both IPv6 and IPv4, and thus you should see no discernable difference when connecting. If you do, let us (or the site you’re visiting) know – it’s important for everyone participating in IPv6 day to hear about any unexpected issues or problems so we can all work to address them before a full IPv6 migration gets under way.

You can also participate on DevCentral:

So give it a try and participate if you can, and make it a great day!

Windows Azure Connect uses IPv6 to connect local Windows Vista, Windows 7, Windows Server 2008, and Windows Server 2008 R2 machines to Windows Azure instances. These four OSes support IPv6 natively. See the MSDN Library’s Overview of Firewall Settings Related to Windows Azure Connect topic, updated 11/2010:

image [This topic contains preliminary content for the beta release of Windows Azure Connect. To join the program, log on to the Management Portal, click Home, and then click Beta Programs.]

imageIn Windows Azure Connect, the firewall settings on local endpoints (local computers or VMs) are under your control. Windows Azure Connect uses HTTPS, which uses port 443. Therefore, the port that you must open on local endpoints is TCP 443 outbound. In addition, configure program or port exceptions needed by your applications or tools.

Note that when you install the local endpoint software, a firewall rule is created for Internet Control Message Protocol version 6 (ICMPv6) communication. This rule allows ICMPv6 Router Solicitation and Router Advertisement (Type 133 and Type 134) messages, which are essential to the establishment and maintenance of an IPv6 local link. Do not block this communication. [Emphasis added.]

When you activate a role for Windows Azure Connect, the firewall settings for the role are configured automatically by Windows Azure. In addition to these firewall settings, you might need to configure program or port exceptions needed by your applications or tools. Otherwise, we recommend that you do not change the firewall settings on an activated role.

Additional references

Managing Windows Firewall with Advanced Security (http://go.microsoft.com/fwlink/?LinkId=206659)

Troubleshooting Windows Azure Connect

For more information about Windows Azure Connect’s use of IPv6, see the MSDN Library’s Connecting Local Computers to Windows Azure Roles topic and its subtopics.


<Return to section navigation list> 

Live Windows Azure Apps, APIs, Tools and Test Harnesses

Steve Nagy (@snagy, pictured below) posted his version of Steve Marx’s Ten Things You Didn’t Know About Windows Azure on 6/5/2011:

image Steve Marx did a talk at the MVP Summit at the beginning of the year about things you can do with Windows Azure, and the talk then featured at Mix as well. I followed his lead and delivered a similar talk at Remix Australia which was titled ‘10 Things you didn’t know about Windows Azure’.

Since my blogging has been somewhat lax of late, I wanted to take this opportunity to quickly cover off those things I spoke about, in case you also were unaware of some of the things you can do.

Combine Web and Worker Roles

I talked about web and worker roles quite some time ago. The key premise is the same; web roles are great for hosting websites and services behind load balancers, while worker roles are like windows services and good for background processing. But perhaps you were unaware that you can combine worker role functionality into your web role?

Doing so is really quite easy. Both worker roles and web roles can have a class added to the project that inherits from ‘RoleEntryPoint’ (class remarks indicate that this is mandatory for worker roles). Inside this class is a method that can be overridden called ‘Run’. When you do so, you can provide code similar to the worker role code that starts an infinite loop and performs some ‘work’. By overriding this method in your web role you can achieve worker role behaviour.

Warning: The default implementation of RoleEntryPoint.Run is to sleep indefinitely and never return; returning from the Run method will result in the instance being recycled. Therefore, if you override the Run method, make sure that when you finish doing work you still sleep indefinitely as well (or just call base.Run()).
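
Here is a minimal sketch of what that looks like in a web role’s WebRole.cs (the ‘work’ is just a placeholder; the point is the RoleEntryPoint override pattern Steve describes):

using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
    public override void Run()
    {
        // Background "worker" processing hosted inside the web role.
        while (true)
        {
            // Do some background work here (drain a queue, clean up data, ...).

            // Never return from Run(); sleep between iterations so the
            // instance is not recycled.
            Thread.Sleep(10000);
        }
    }
}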

Extra Small Instance for VM Sizes

Previously the smallest instance size was a ‘small’, which comprised a single core at 1.6GHz, 1.75GB memory, and about 100Mbps to the virtual network card, all for around 13 cents an hour. From there, the other VM sizes increase uniformly; medium is 2 cores at the same speed, with twice as much RAM and network speed, and of course twice the price; large is double medium, and extra large is double large.

One of the great things about Azure is that the cores are dedicated, so how can you go smaller than a single core without sharing? Well, the extra small instance is in fact a slower core at 1.0GHz with half as much memory. The network speed is the biggest drop, at only 5Mbps, but the cost is also quite low, coming in at around 5 cents per hour.

Host a Website Purely in Windows Azure Storage

Ok, so the website would have to be reasonably static: your HTML, CSS, and JavaScript files can all be stored in blob storage and even delivered by the Windows Azure CDN. But you could take it a step further and store a Silverlight XAP file in blob storage, and perhaps it could even talk to table storage to pull lists of data. Keep in mind though that you want this to be read-only; don’t store your storage key in the Silverlight XAP file because anyone could disassemble it and get your key (remember Silverlight runs on the client, not the server).

Note also that blob storage does not support a notion of ‘default’ files to execute in folders. This means that even if you do have a static site hosted all in blob storage, the client would need to specify the exact URL. For example, this link works, but this link does not.
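
As a rough illustration (not in Steve’s post), here is what pushing such a static site into blob storage might look like with the 1.x storage client. The container name, file list and connection string are all placeholders:

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class StaticSiteUploader
{
    static void Main()
    {
        // Placeholder connection string; substitute your own storage account.
        var account = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=...");
        var client = account.CreateCloudBlobClient();

        // Publicly readable container so browsers can fetch the files directly.
        var container = client.GetContainerReference("site");
        container.CreateIfNotExist();
        container.SetPermissions(new BlobContainerPermissions
        {
            PublicAccess = BlobContainerPublicAccessType.Blob
        });

        Upload(container, "index.html", "text/html");
        Upload(container, "site.css", "text/css");
        Upload(container, "app.js", "application/x-javascript");
    }

    static void Upload(CloudBlobContainer container, string file, string contentType)
    {
        var blob = container.GetBlobReference(file);
        // The content type matters: it is what the browser hands back to the client.
        blob.Properties.ContentType = contentType;
        blob.UploadFile(file);
    }
}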

Lease Blobs

The REST API for blobs allows much more than the current SDK exposes strongly typed classes for. Blob leases are one such example: you can take a lease on a blob to prevent another process from modifying it. Acquiring a lease returns a lease ID which can be used in subsequent operations. Leases expire after a certain timeout, or can otherwise be released. Steve Marx describes the code required in this post.
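
Condensed from the approach in Steve Marx’s post, the acquire step with the 1.x storage client’s protocol helpers looks roughly like this (see his post for the matching renew/release/break helpers):

using System;
using System.Net;
using Microsoft.WindowsAzure.StorageClient;
using Microsoft.WindowsAzure.StorageClient.Protocol;

static class LeaseExtensions
{
    // Acquire a lease on a blob and return the lease id for later operations.
    public static string AcquireLease(this CloudBlob blob)
    {
        var credentials = blob.ServiceClient.Credentials;
        var uri = new Uri(credentials.TransformUri(blob.Uri.ToString()));

        // Build the raw "comp=lease" request and sign it with the account key.
        HttpWebRequest request = BlobRequest.Lease(uri, 90, LeaseAction.Acquire, null);
        credentials.SignRequest(request);

        using (var response = request.GetResponse())
        {
            return response.Headers["x-ms-lease-id"];
        }
    }
}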

Multiple Websites in a Web Role

Previously one website meant one web role, but now you can change the service definition so that it includes another directory as part of your web role. That directory can be configured with a different binding, which means it is deployed to the underlying role’s IIS as a second website. You can differentiate these websites either by using different port bindings (e.g. 80 for one and 8080 for another site) or by using a host header indicating that your custom domain name should direct to the second site.

Role Startup Tasks

It is now possible to execute a task at start-up of your instance. Most of the work we need to do to configure our machines requires higher privileges than we would like to run our apps with, so it’s not acceptable to simply put code in our Application_Start events. With start-up tasks we can run a batch file elevated before starting the main application in a more secure mode.

We achieve this by creating a batch file with our instructions and including it as part of the package. We then specify the batch file to run at start-up in our service definition file. It could look a little similar to this:

(The original post shows the start-up batch file and the corresponding Task element of the service definition as screenshots.)

Run Cool Stuff

The great thing about start-up tasks is that they not only enable the dependencies of our application, but also let us run other cool stuff, including other web servers! Here’s a (brief) list to get you thinking:

Ok maybe I lost you on that last one…

Securely Connect On-Premise Servers to Roles

Pegged as part of the Windows Azure Virtual Network offering, Windows Azure Connect is a service that allows you to connect any machine to a ‘role’, which means all of the instances that make up that role. It uses IPSec to ensure point-to-point security between your server and the role instances running in the cloud. The role instances can effectively see your server as if it were on the local network, and address it directly. This opens up a number of hybrid cloud scenarios; here’s just a sampler:

  • Keeping that Oracle database backend located on premise
  • Using your own SMTP server without exposing it to the internet
  • Joining your Azure role machines to your private domain
    • … and using Windows Authentication off of that domain to login to your Azure apps
Manage Traffic Between Datacentres

The other part of the Windows Azure Virtual Network offering is the Windows Azure Traffic Manager. This service allows us to do one of three things:

  • Fail over from one Azure datacentre to another
  • Direct user requests for our site to the nearest datacentre
  • Round robin requests between datacentres

This service requires you to deploy your application more than once. For example, you might deploy your site to the Singapore datacentre, deploy it to the same datacentre a second time, and then deploy it to the South Central US datacentre. You could then configure the traffic manager to failover in that order, such that if your first deployment in Singapore died, it would start diverting traffic to the second deployment in Singapore, and if that also died (for example the whole datacentre was overrun in the zombie apocalypse) then your traffic would then be redirected to South Central US.

Traffic Manager doesn’t actually direct traffic like a switch; it is only responsible for name resolution. When a failover occurs, it starts handing out resolutions pointing to the second location. In the interim some browsers will still have entries indicating the first location’s IP address; those will continue to fail until the TTL (time-to-live) on the DNS resolution expires and the client makes a new request to the Traffic Manager.

CNAME Everything

You can now put your own CNAME on pretty much everything in Windows Azure. This allows you to make sites look like your own. For example, this site demonstrates a site running from blob storage with its own domain name, and if you view source you will see that the images are also being served from a custom domain which wraps up the CDN. Similarly you can also wrap up your web roles in your own domain, and when running multiple websites from a single web role you can use a domain name as the host header to identify which website to target. Finally, the endpoint for the new Windows Azure Traffic Manager can also be aliased by your own domain.

Summary

There you have it. My slides from Remix were essentially just the 10 points, so rather than upload them somewhere to be downloaded, I thought I’d just share the main points above instead.

If you want to learn more things you might not know about Windows Azure, check out Steve Marx’ presentation from Mix 11 or check out his ‘things’ site which will also give you some insight:

http://things.smarx.com/

“[S]omewhat lax of late” = six months with no posts. Check Steve’s confession here.


Robin Shahan (@robindotnet) posted Exploring Visual Studio 2010 Tools for Windows Azure: A Tutorial to The Code Project on 6/3/2011:

image This article is in the Product Showcase section for our sponsors at The Code Project. These reviews are intended to provide you with information on products and services that we consider useful and of value to developers.

Windows Azure is Microsoft’s  cloud operating system. The tools integrated into Visual Studio make Windows Azure quick and easy for developers who are familiar with  .NET development to adopt. The tools provide a steamlined way to create, develop, and publish cloud projects, as well as view storage data.

First, you must have some version of SQL Server installed – this can be any flavor of SQL Server 2008, or SQL Server Express 2005. This is for emulating the cloud storage when you test your application locally. If you already have VS2010 installed, then SQL Server should’ve been installed automatically.

Next, you will need to download and install the Windows Azure Tools and SDK.  You can find the Windows Azure Tools & SDK here:  http://www.microsoft.com/windowsazure/getstarted/

Note: When you select File / New Project below and start a cloud project, Visual Studio will download the latest Azure tools if they’re not currently installed. See the below screen capture as an example of what it looks like when Azure tools are not installed.

image

The Tools for Visual Studio add the ability to create cloud projects that you can test locally and publish to Windows Azure. Let’s start by running Visual Studio in administrator mode.

Select File / New Project. Under either Visual Basic or Visual C#, you should now see a category for Cloud as illustrated in Figure 1. Click on it, fill in the information at the bottom and click OK.

image

Figure 1: Creating a cloud project in Visual Studio 2010.

Next you will be prompted for the type of role that you want to use. This will host the actual code. There are only two types of roles – web roles and worker roles. Web roles use IIS by default and worker roles do not. So if you are going to create a web application or WCF service, you will want to use a web role.

Worker roles are used more for processing. I’ve used them in cases where I used to have a Windows service running on a server. For example, if you are taking wav files and converting them to MP3 asynchronously, you could submit that to a worker role and let it do it for you.
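To make the difference concrete, here is a minimal worker role skeleton (my own sketch, not code from Robin’s tutorial; the class name and the conversion step are hypothetical placeholders):

using System;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical worker role: background processing with no IIS involved.
public class AudioConversionWorker : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Typical SDK boilerplate: raise the default connection limit for outbound calls.
        System.Net.ServicePointManager.DefaultConnectionLimit = 12;
        return base.OnStart();
    }

    public override void Run()
    {
        // Worker roles normally loop forever, polling a queue or doing background work.
        while (true)
        {
            // e.g., dequeue a message pointing at a .wav blob, convert it to MP3, save the result.
            Thread.Sleep(10000);
        }
    }
}

If Run ever returns, Windows Azure treats the instance as failed and recycles it, so the infinite loop is deliberate.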

Select an ASP.NET Web Role. In the pane on the right, if you hover over the first line, a pencil icon will appear – click it to edit the name of the Web Role. I’m going to name my web role “AwesomeWebApp” instead of “WebRole1”, as shown in Figure 2.

image

Figure 2: Adding a web role.

After clicking OK, you should have something similar to Figure 3.

image

Figure 3: New web application in a web role.

There are two projects. AwesomeWebApp is the web role. This is what will actually run in the instance on Windows Azure. The second one is the cloud project. This contains the role itself and the service configuration and service definition files. These apply to all of the instances of the role that are running. Let’s look at the service configuration first.

In Figure 4, I have changed two things. The osFamily value determines whether the instance runs Windows Server 2008 (osFamily = “1”) or Windows Server 2008 R2 (osFamily = “2”). I always want to run the most recent version, so I’ve changed this to the latter. I’ve also added some more settings. You will probably want to move some of your settings from your website’s web.config file to your service configuration, because you can modify your Service Configuration file while the instance is running, but you can’t modify the web.config file – you have to fully redeploy the application for changes to the web.config to take effect. For example, I put settings in my service configuration for the frequency of performance counter logging, so I can raise and lower it without needing to re-publish the entire project.

image

Figure 4: Service Configuration
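For reference, role code reads those settings through the RoleEnvironment API, which is what makes the “change the configuration without redeploying” scenario work. Here is a minimal sketch (the setting name is a made-up example, not one from Robin’s project):

using Microsoft.WindowsAzure.ServiceRuntime;

public static class DiagnosticsSettings
{
    // "PerfCounterSampleRateSeconds" is a hypothetical setting defined in ServiceConfiguration.cscfg.
    public static int GetPerfCounterSampleRate()
    {
        if (RoleEnvironment.IsAvailable)
        {
            string value = RoleEnvironment.GetConfigurationSettingValue("PerfCounterSampleRateSeconds");
            return int.Parse(value);
        }
        return 60; // fallback when running outside the Windows Azure environment
    }
}

Handling the RoleEnvironment.Changing and Changed events lets an instance react to an edited configuration value without being recycled.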

The Service Definition is just a definition of what variables are in the Service Configuration file. Mine is displayed in Figure 5.

image

Figure 5: Service Definition.

You can also edit the values of the service configuration through the role properties. To see the role properties displayed in Figure 6, just double-click on the role in Solution Explorer.

image

Figure 6: Role Properties

Here you can set the basic properties of your role, including the number of instances and the size of the VM that you want to use. You can also specify a connection string for the Diagnostics, which are stored by default in Windows Azure Storage.  You can use the EndPoints tab to manage the endpoints for the application.

The Certificates tab is for specifying the SSL certificate used when you are using https endpoints and/or when you enable RDP access to the role instance. Local Storage is used to configure local file system storage resources for each instance, and whether to clear them when the role recycles.

Rather than edit the configuration settings in the XML, you can use the grid in the settings tab, displayed in Figure 7, to edit the values of the settings you have already defined, and to add new settings. If you add a new setting here, it automatically adds it to the service definition as well.

image

Figure 7: UI for editing settings.

Hit F5 in Visual Studio to run your Windows Azure instance just the way you would run any other application in Visual Studio. This runs the role instance(s) in the “development fabric”, which simulates running in production. Note that just because something works in the development fabric, it doesn’t mean you can be 100% certain it will run when you publish it to Windows Azure, but it will get you most of the way there and it doesn’t cost you anything to use it.

Your browser should open, showing your running web application.

You will see the Windows Azure icon in your system tray. If you right-click on it, you can view the Windows Azure Compute Emulator (where you can see your role running or not running, whichever the case may be). Mine is displayed in Figure 8.

image

Figure 8: Windows Azure Compute Emulator

From here, you can attach a debugger to an instance, view the logging from the startup of each instance, and manage the service in the development fabric (restart it, stop it, start it, etc.). You can also get the IP address for the service, in case you didn’t check the box in the role properties to have the http page start up automatically when the service started.

If you look at the Storage Emulator, you should see something similar to Figure 9.

image

Figure 9: Storage Emulator

This gives you the endpoints for storage if you need them, and enables you to manage the development table storage (click Reset to clear it out).
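If you want to talk to that emulated storage (or to the real service) from code, the pattern looks roughly like this sketch; the account name and key are placeholders, not real credentials:

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class StorageConnectionExample
{
    static void Main()
    {
        // Development storage -- the local Storage Emulator endpoints shown in Figure 9.
        CloudStorageAccount devAccount = CloudStorageAccount.DevelopmentStorageAccount;

        // A real Windows Azure Storage account; name and key below are placeholders.
        CloudStorageAccount cloudAccount = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=YOUR_KEY_HERE");

        CloudBlobClient blobClient = devAccount.CreateCloudBlobClient();
        // ... create containers, upload blobs, work with tables and queues, etc.
    }
}

The same “DefaultEndpointsProtocol=…;AccountName=…;AccountKey=…” format (or “UseDevelopmentStorage=true” for the emulator) is what goes into the connection-string setting described in the next step.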

Let’s publish the application to the cloud. First go into Settings and set the connection string for Windows Azure Storage to point to your storage in Windows Azure. Then right-click on the cloud project and select Publish. You will get the dialog displayed in Figure 10.

image

Figure 10: Publish the project to Windows Azure

If your role targets the .NET 4 Framework and you are running Visual Studio 2010 Ultimate, you will be able to enable IntelliTrace for your role. If you do this, you can then see the IntelliTrace output and use it to debug your role.

If you select the blue link that says “Configure Remote Desktop connections…”, you can configure the role to allow you to use Remote Desktop to connect and log into the instance after it has completed starting up. You will have to create a certificate and upload it to your hosted service before you can do this. I’m going to skip this for now and click on OK.

Visual Studio will build the solution, create a service package, upload it, create a new VM, and deploy the package to it. You can see it progress in the Windows Azure Activity Log in Visual Studio in Figure 11. When it has successfully completed, it will say “Complete”.

image

Figure 11: Windows Azure Activity Log displayed while publishing to the cloud.

You can also watch the progress in the Windows Azure Portal (http://windows.azure.com) as shown in Figure 12.

image

Figure 12: Viewing the status of the deployment in the portal.

The Portal can be used to manage all of your services and your storage accounts. You can also edit the service configuration of your deployment through the portal. If you have configured RDP, this is where you would connect to your instance.

After your role publishes successfully, you should be able to open the URL in the browser and see your web application running in the cloud. You can actually click on the link in the Windows Azure Activity Log once the role is complete, as seen in Figure 13.

image

Figure 13: Completed deployment.

You will also see the link that says “Open in Server Explorer”.  If you click on that link, Server Explorer will be displayed in Visual Studio, and you will be able to see your instance running, as displayed in Figure 14.

image

Figure 14: Server Explorer

You can also view your Windows Azure Storage in Server Explorer. You will probably have to add your storage account to the list – just right-click on Windows Azure Storage and select Add New Storage Account. You will be prompted for your credentials, and then you can view the content. The content is read-only. You can view the blobs and the rows in the tables.

In summary, Microsoft has provided a development environment that makes it easy for .NET developers to develop for Windows Azure by integrating the tools into Visual Studio 2010. You can sign up for a free trial today at www.microsoft.com/cloud/windowsazure.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Robin is currently the Director of Engineering for GoldMail, where she recently migrated their entire infrastructure to Microsoft Azure.


Tom Nolle (@CIMICorp) listed Three questions to ask when considering Microsoft Azure in a 6/3/2011 post to TechTarget’s SearchCloudComputing.com:

image Cloud computing is, for some, a means of escaping from the clutches of traditional computer and software vendors. Most enterprises realize that the value of cloud will depend on how well services integrate with their own IT commitments and investments. Because Microsoft is so much a part of internal IT, Microsoft's cloud approach is especially important to users. Many will find it compelling; others may decide it's impossible to adopt. Which camp are you in?

imageThe foundation of Microsoft Azure's value proposition is the notion that users must otherwise design their enterprise IT infrastructure for peak load and high-reliability operation, even though both of these requirements waste a lot of budget dollars. The Azure solution is to draw on cloud computing to fill processing needs that exceed the long-term average. It also backs up application resources to achieve the necessary levels of availability if normal data center elements can't provide those levels.

image This means that Azure, unlike most cloud architectures, has to be based on elastic workload sharing between the enterprise and the cloud. Microsoft accomplishes this by adopting many service-oriented architecture (SOA) concepts, including workflow management (the Azure Service Bus).

Within Azure, there are multiple sub-platforms that Microsoft calls "Roles." The Web Role provides Internet access to Azure applications and thus allows Azure apps to function essentially as online services. Based on scripting and HTML tools, it is hosted on Microsoft's Internet Information Services (IIS). The Worker Role is a Windows Server executable task that can perform any function desired, including linking an Azure cloud application back to the enterprise data center via Azure Connect. The Virtual Machine Role provides a host for any Windows Server applications that aren't Azure-structured.

This illustrates the Microsoft vision clearly: develop applications that are Azure-aware and exploit the most architected hybrid cloud IT architecture available from a major supplier. Unlike most cloud services that try to create hybrids by joining together resources not designed to be linked, Azure defines a linkable IT architecture. Users who want to deploy that architecture within their data centers would use the Azure Platform Appliance, which makes data centers operate in exactly the same way as Microsoft's. This provides a private cloud and virtualization architecture rolled into one, along with creating a highly scalable and manageable way to improve server utilization and application availability (even if the services of the Azure cloud are not used). [Emphasis added.]

Is Azure for you?
If your data center is substantially based on Windows Server, if you keep your Windows Server licenses current, and if your Windows Server applications are integrated (functionally and in terms of data) largely among themselves, there's a good chance that your whole Windows Server operation could be migrated to a Virtual Machine Role within Azure. From there, it could evolve into a set of Azure-compliant applications.

For most users, this is the basic test: how much of my IT application base is Virtual Machine Role-compatible? If the answer is none, or very little, you probably won't be able to justify an Azure migration. And obviously, a major commitment to Linux or another operating system within the data center will also make Azure unattractive.

A corollary to this: if you have already made a significant commitment to virtualization within your data center, you may find Azure benefits less compelling. Because Azure can improve in-house efficiency as well as provide application backup and offload to the cloud, it can draw on a double benefit case. In-house virtualization, however, may have already captured one of those sets of benefits. Virtualization is also often linked to Linux use, which may mean you don't have a nice Windows Server community to migrate.

The second question to ask is, how much of your application base is either self-developed or based on Azure-compliant software? The presence of a large number of Windows Server applications that can't be made Azure-compliant will erode the value proposition. Azure is unequaled in its ability to flexibly transfer work within an "Azure domain" that includes both your data center (via Azure Connect or the Azure Platform Appliance) and the cloud. The more you can exploit that ability, the better Azure will serve your needs.

The third and final question is, how committed to SOA is your current operation? Azure's AppFabric is essentially a SOA framework, with the Service Bus being equivalent to a SOA Enterprise Service Bus (ESB) extended online. If you've already developed or obtained Microsoft-compatible SOA/ESB software products or components, you're further along toward Azure-specific applications and the optimum Azure benefit case. If you have no SOA implementation or knowledge, the learning curve and software refresh may complicate Azure adoption.

Microsoft's Azure is not an Infrastructure as a Service-like general-purpose cloud platform; it's more like Windows Server in the cloud. For Microsoft shops, that alone is a benefit in terms of skills transfer. By extending the basic Microsoft SOA principles into cloud computing, it may very well provide the best cloud option available.

More precisely, Windows Azure is a Platform as a Service (PaaS) running on customized multi-tenant implementations of Windows Server 2008 R2 and SQL Server 2008 in Microsoft Data Centers around the world.

Tom is president of CIMI Corporation, a strategic consulting firm specializing in telecommunications and data communications since 1982.

Note that the Windows Azure Platform Appliance isn’t available as a product almost a year after its announcement at the Worldwide Partners Conference 2010. WAPA is still a Microsoft skunkworks project.

Full disclosure: I’m a paid contributor to TechTarget’s SearchCloudComputing.com site.


Michael Coté (@cote) explained Cloud = Speed, or, How to do cloud marketing in a 6/2/2011 post to his Redmonk blog:

image I’ve sat through, probably, 100s of presentations, talks, papers, and marketing on cloud at this point: it’s been several years now. There’s a huge variety in the marketing messages, pitches, and explanations for what cloud computing is and why you, dear CIO (or are you?), should spend time and money on us for your Cloud Experience.

image After all this time I’ve come up with a theory for cloud marketing: cloud is speed. Everything else supports that simple, clear message, and anything else is distraction and marketing-bloat.

Pardon me as I get hyperbolic to explore the theory here – I explicitly call it a “theory” because I don’t think it’s 100% solid and it definitely doesn’t apply across the board…but let’s pretend for awhile that it does.

Cloud marketing tactics

If you accept the theory that cloud is speed (which we’ll explore below), that means several things for your cloud marketing:

  • You start by telling your audience that once they start using cloud, they’ll be able to apply IT to business (make money, more than likely) faster.
  • That speed means not only that you can pursue opportunities (another word for “money”) faster, but that you can fail faster and thus, “fail towards success” (or “iterate”). If failure is “cheap,” you can use it as a way to learn what success is.
  • The point of doing all this cloud stuff is to make things faster: if we can’t prove to you (with past case studies and projected mumbo-jumbo) that our technology makes your IT service/software delivery faster, we’re doing the wrong marketing.
  • You have to build enough trust to have the audience ask for more – you’ve got to gain the benefit of the doubt for a very doubtful message. More than likely, no one is going to believe you: every wave of IT has promised to make things more “agile,” cheaper, and faster. If it worked, why are we here with all this opportunity to speed things up? (Never mind Jevons’ Paradox for now.)

Of course, if you’re of the “every great argument needs an enemy” mindset, that implies another straw-man you need to depict.

The problem with IT is that it’s slow

In order to sell more IT, you must [convince] people there’s a problem, and that it can be solved with what you have to sell. Here, the problem is that traditional IT – “legacy” if you want to be aggressive – is slow by nature. There’s nothing you can do to fix it by addressing the symptoms; you have to change the core sickness. To put it another way, “cloud dusting” won’t work: you’ll end up with the same boat-anchor of IT with just a different management layer on top.

This is a point Randy Bias makes relentlessly, and rightfully so. [He] and the Cloudscaling folks have been wiggling-up a more nuanced, pragmatic argument that exposes the costs of the “legacy cloud” vs. the (my words) “real cloud” that’s worth checking out.

Again, racing to the simple message: cloud is speed. How long does it take to deploy a new release, IT service, or patch, provision a new box, and so on with your “traditional” setup? How fast does cloud allow you to do it? If a marketer doesn’t immediately prove that cloud is not just faster, but dramatically faster, the whole thing is off.

You want something like this chart from Puppet Labs:

Once you speed up IT, you make more money

Slow IT means the business has to move slower, both missing out on opportunities in hand and missing out on the option to spend time developing new opportunities. If I, as a business, can try out a bunch of different little things quickly and cheaply (see below), I’m not “trapped” in my current shackles of success: “that new idea is all fine and well, but do you have any idea how long it’d take to build it up to even try it out? It takes 4 weeks and $100,000 just to get them to add a new field to our order form!”

We’ve used the term “business/IT alignment” in this industry a lot over the past few decades. It sounds awesome, and powerful, and like big bonuses: the CEO actually depends on me to help make money…and it works! Business/IT alignment has meant many things when it comes down to the details. Here, it means one thing: speed. By using cloud (the marketing message goes), we can actually respond fast enough to be valuable to the business.

Does this hold water? A survey from Appirio last Fall seems to answer “yes,” at least, you know, among the people who answered:

Have these expectations been realized in actual results? Cloud adopters report that they have. More than 80% of companies that have adopted cloud applications and platforms say that they are now able to respond faster to the business and achieve business objectives. They’ve also found these solutions easier to deploy and cheaper to maintain. The cloud has helped change IT’s role in the business—70% of adopters say that IT is now seen as a business enabler and 77% say that cloud solutions have changed the way they run their business.

(What’s novel about the Appirio survey is that they asked people who’ve already been using cloud stuff, not just what people are “planning” on doing, which most cloud surveys ask – it’s worth submitting yourself to the Appirio lead-gen funnel to get the PDF.)

What about cost?

EC2 means anyone with a $10 bill can rent a 10-machine cluster with 1TB of distributed storage for 8 hours. @mrflip
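Rough arithmetic backs that up (my numbers, assuming the published small-instance rate of roughly $0.085 to $0.10 per hour at the time): 10 machines x 8 hours = 80 instance-hours, or about $7 to $8 of compute, leaving the remainder of the $10 bill for storage and bandwidth.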

Costs are a type of friction that slows things down. Having lower costs is table stakes, and if your cloud offering isn’t at least affordable, it’s going to take longer to catch on. Once something is cheap, I can do more of it, more frequently, meaning I can try out more things, explore more options, and – yup – move faster.

If I can spin up a super computer in hours rather than months (or minutes!) for thousands rather than millions, I can achieve a huge amount of speed because I can do more, at lower cost, more frequently. Lowering costs for the sake of lowering costs is only valuable for IT when there’s nothing new gained with the new technology. “Commoditized” IT is what fits here: x86 boxes, email without Enterprise 2.0 bells and whistles, backup software. As a counter example, notice how Apple is able to build up brand to not do that: also notice how “closed” their whole brand ecosystem is.

Controlling costs, then, is something that supports speed. If cloud was the same price as traditional IT, or more expensive, it would slow down the rate at which you could use it (unless you have unlimited budget, e.g., spies, scientists, and other past customers of super-computers).

(There’s a sub-argument to be made that lower costs “democratized” the technology, like open source did the Java application servers and middle-ware, and later software development in general. But, at this high-level, that’s details for further discussion.)

Building trust so you can get to the boring stuff

The most important thing you need to do for cloud marketing is to make people actually believe you. The goal is to get the benefit of the doubt enough to be asked to speak for a few more hours on the topic. The stretch goal, if you’re a public cloud thing, is to get people to trust you enough to sign up for the service, for example, to try the Opscode Platform trial, do some “Hello world!”ing on Heroku, putter around on GitHub, or just mess about in EC2. Sadly, most people with high-dollar cloud stuff to market don’t have that luxury as they’re selling private clouds, which require, you know, the usual PoCs and such, if only in hardware acquisition and network/security setup. Appliances can go a long way here, of course. But, back to that first goal: being asked back to further educate the prospect.

While cloud gasbags like myself may bore of the whole “what is a cloud” talk, when I talk with the cloud marketers in the trenches, that’s a whole lot of what people want. What exactly does your vision of the cloud mean and how does it apply to me? And let’s be frank: if you, a highly paid marketer, are involved, you need to sell big-ticket, transformative projects, not tiny things (marketing to the masses is an entirely different set of tactics). This isn’t a cynical “a sucker is born every minute” take; it’s realizing that if IT is to be a core asset for business, it’s going to be a big deal. The Big Switch hasn’t happened just yet.

How can you build this trust? First, focus on what’s different this time. Explain what has changed technologically to make this speed with cloud possible. Again: “agility” is something IT is supposed to deliver and everyone laughs at. Here are some stock “what’s different this time”s:

  • New options are available: Amazon, Rackspace, GoGrid, and all the other public clouds are new, different ways to run IT. They’re fundamentally different: we’re not just installing magic software on top of existing gear; the way you manage the hardware, the datacenters, the network, etc. is different.
  • Moore’s law has delivered: infrastructure is cheaper and faster (and somehow outpacing Jevons’ Paradox, I guess – though certainly not on my desktop!)
  • Once you move to this new model and change the way you develop and manage applications, you can start doing things differently like Frequent Functionality (delivering smaller chunks more often), observing user behavior (a million one way mirrors), and other useful killer features.
  • Thanks to consumer web apps and the iPhone-led renaissance in smart phones, people don’t expect IT to suck as much. It’s the old “if Facebook can do it so easily, why can’t the IT department.” This trend is important, because it means there’s demand for IT to suck less – put another way, they have a reason to change – put another way: rogue IT is a very real competition, see that ticker symbol CRM.

There’s more, of course: but to win the benefit of the doubt, you have to explain why it’s different this time, why it will work. All you want to achieve is getting asked back for more and if you can make them believe that cloud is speed, they might just ask for further discussion. And, it of course helps to no end if there’s plenty of examples, but you don’t always have that luxury.

Everything else

Obviously, there are more details, but I want to keep the theory simple: cloud is speed. Once you go down that rat-hole, the other valuable things like dev/ops, the types of applications you drive on top of cloud, how you change your delivery model, and where your offering fits in on the SaaS/PaaS/IaaS burger all come next. The biggest argument of all is over public vs. private. The idea that you need private has all but won, with “security” and special-snowflake concerns weighing too heavily on decision makers’ minds.

But all those things come from what you, the marketer, are actually trying to sell beyond simply speeding things up.

(Here’s some more details and rat-holes in lovely mind map form for those who want to dig deeper.)

Disclosure: Puppet Labs, Opscode, and others mentioned above or relevant are clients. See the RedMonk client list.

Read more: http://www.redmonk.com/cote/2011/06/02/how-to-do-cloud-marketing/#ixzz1OENWDPuz



<Return to section navigation list> 

Visual Studio LightSwitch and Entity Framework v4+

Michael Washington (@ADefWebserver) described LightSwitch Concurrency Checking on 6/4/2011 with a link to an online demo:

image One of the greatest benefits of using LightSwitch is that it automatically manages data integrity when multiple users are updating data. It also provides a method to resolve any errors that it detects.

image2224222222The first issue is very important, and with most web applications, it is not handled. Simply, the last person to save a record overwrites any other changes, even if those other changes were made after the user initially pulled up the record. The second issue, a method to resolve this situation, is priceless, because the code to create the “conflict resolution process” is considerable.

This blog post does not contain any code examples, because there is no code for you to write to get all these features!

image

Log in and use the Things For Sale example located here: http://lightswitchhelpwebsite.com/Demos/ThingsForSale.aspx

image

When we first pull up an existing Post, the Price of the car is $6,500.00.

image

Change the price to $7,500.00. Do not save yet.

image

Open a new web browser window and pull up the record, the price is still $6,500.00.

image

Go back to the first window, and save the record (the price in the database is now $7,500.00).

image

Now, go to the second window and try to save; you will get a dialog box indicating that the data has changed since the last time you saw it.

image

Most importantly, it detects and highlights the discrepancy of each individual field, and provides a mechanism to resolve each one.

LightSwitch, Because Your Business Data Is Important

Your business data is important. This sort of integrity checking is expected of web professionals, but try this experiment on your web application, and don’t be surprised if it simply uses the last value.

The reason this is important is that the second-to-last user will report to management that they updated the Price of the car. The last user will tell management that they were in the system and never saw the Price change. The programmers will then be instructed to implement logging. If they had logging, they would be able to determine what went wrong, but that would not prevent the problem from happening again.

By using LightSwitch, you avoid the problem in the first place.

Microsoft Access provides similar detection of potential concurrency errors. The problem, of course, is that the last user to change the value determines whether his or her change is saved.


• Steve Yi announced the availability of a 00:08:41 Video How-To: Creating Line of Business Applications using SQL Azure Webcast in a 6/3/2011 post to the SQL Azure Team blog:

image This walkthrough explains how to easily create Line of Business (LOB) applications for the cloud by using Visual Studio LightSwitch and SQL Azure. The video highlights the benefits of Visual Studio LightSwitch and explains how you can quickly create powerful and interactive Silverlight applications with little or no code.

image2224222222The demo portion of the video shows how to build an expense application and deploy it using the Windows Azure portal. The conclusion points you to additional resources to help you get started creating your own applications.

Please take a look and if you have any questions, leave a comment. We have [several] other great how-to videos and code samples available on the SQL Azure CodePlex site at: http://sqlazure.codeplex.com.


Michael Washington’s Microsoft Visual Studio LightSwitch Help Website added a Marketplace section, discovered on 6/3/2011:

image

Hopefully, the Marketplace will incorporate more product categories in the future.



Julie Lerman (@julielerman) wrote Demystifying Entity Framework Strategies: Loading Related Data for MSDN Magazine’s June 2011 issue:

In last month’s Data Points column, I provided some high-level guidance for choosing a modeling workflow strategy from the Database First, Model First and Code First options. This month, I’ll cover another important choice you’ll need to make: how to retrieve related data from your database. You can use eager loading, explicit loading, lazy loading or even query projections.

imageThis won’t be a one-time decision, however, because different scenarios in your application may require different data-loading strategies. Therefore, it’s good to be aware of each strategy so that you can choose the right one for the job.

As an example, let’s say you have an application that keeps track of family pets and your model has a Family class and a Pet class with a one-to-many relationship between Family and Pet. Say you want to retrieve information about a family and their pets.

In the next column, I’ll continue this series by addressing the various choices you have for querying the Entity Framework using LINQ to Entities, Entity SQL and variations on each of those options. But in this column, I’ll use only LINQ to Entities for each of the examples.

Eager Loading in a Single Database Trip

Eager loading lets you bring all of the data back from the database in one trip. The Entity Framework provides the Include method to enable this. Include takes a string representing a navigation path to related data. Here’s an example of the Include method that will return graphs, each containing a Family and a collection of their Pets:

from f in context.Families.Include("Pets") select f

If your model has another entity called VetVisit and that has a one-to-many relationship with Pet, you can bring families, their pets and their pet’s vet visits back all at once:

from f in context.Families.Include("Pets.VetVisits") select f

Results of eager loading are returned as object graphs, as shown in Figure 1.

Figure 1 Object Graph Returned by Eager Loading Query

image

Include is pretty flexible. You can use multiple navigation paths at once and you can also navigate to parent entities or through many-to-many relationships.

Eager loading with Include is very convenient, indeed, but if overused—if you put many Includes in a single query or many navigation paths in a single Include—it can detract from query performance pretty rapidly. The native query that the Entity Framework builds will have many joins, and to accommodate returning your requested graphs, the shape of the database results might be much more complex than necessary or may return more results than necessary. This doesn’t mean you should avoid Include, but that you should profile your Entity Framework queries to be sure that you aren’t generating poorly performing queries. In cases where the native queries are particularly gruesome, you should rethink your query strategy for the area of the application that’s generating that query. …
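To round out the alternatives Julie lists at the top of the column, here is a rough sketch of a query projection and an explicit load (my code, following her Family/Pet example; property names such as LastName are assumptions, and “context” is the EF 4 ObjectContext from her samples):

using System.Linq;

// Query projection: bring back only the fields you need rather than full entity graphs.
var familySummaries = from f in context.Families
                      select new
                      {
                          f.LastName,            // assumed property, for illustration only
                          PetNames = from p in f.Pets select p.Name
                      };

// Explicit loading: retrieve the Pets for one Family on demand, after the fact.
var firstFamily = context.Families.First();
context.LoadProperty(firstFamily, "Pets");

Projections avoid the join bloat of deep Includes when you only need a few columns, at the cost of returning anonymous types rather than change-tracked entities.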

Julie continues with more topics on loading related data and minimizing database trips.

Julie is a Microsoft MVP, .NET mentor and consultant who lives in the hills of Vermont.

Full disclosure: 1105 Media publishes MSDN Magazine and I’m a contributing editor of their Visual Studio Magazine


<Return to section navigation list> 

Windows Azure Infrastructure and DevOps

••• Simon Munro (@simonmunro) described Ten things Azure should do to win developers in a 5/13/2011 post to the Cloud Comments.net blog:

imageDevelopers, although not decision makers, are key influencers in any IT strategy and Microsoft is doing their best to win them over to Windows Azure. Unfortunately they don’t seem to be all that interested, despite having offers of free compute hours thrown at them. Here are some things that Scott Guthrie can give attention over the next couple of years.

image1. Keep the on premise and Windows Azure API the same. Building loosely coupled asynchronous systems on Windows Azure uses worker roles and queues. On Windows there are some choices – services that poll MS Message Queues or WCF Messaging. All three are completely and unnecessarily different.

2. Release Fast. The traditional Microsoft 12 month release cycle isn’t good enough. Amazon is releasing stuff every week and other stacks have open source developers committing to the frameworks.

3. Sort out deployment. Go look at Chef, Puppet, Capistrano, Cucumber, Git and a whole lot of other tools. Then go and look at MS Build, TFS, Powershell and Azure with Visual Studio. Come on Microsoft, pick something that works and make it work across all your platforms.

4. Build the services that people need. Web apps need MapReduce, document databases, search engines (SOLR) and other bits that cannot be run on Azure. People need this stuff. Either get it built or watch them walk away.

5. Bring Azure Technologies On Premise. The way to store binary data on Azure is to use blob storage. On premise it is to use the file system. These two are too different to be easily portable. Why not build a distributed file system for Windows that shares an API with Windows Azure?

6. Get the basics right. How do you run a scheduled task on Windows Azure (cron jobs)? Let me ask that differently. How do you run scheduled tasks that are not a complete hack? (A typical workaround is sketched after this list.)

7. Allow developers to make optimal use of VMs. Running up an instance for every worker role is potentially a waste of resources and the granularity of processes to virtual machines is mismatched. Mismatched in favour of Microsoft making lots of money, that is.

8. Listen to Developers. Listening to enterprise customers is okay for how to build on premise apps. But for the cloud, you need to listen to developers. They’re the ones building stuff and making it work, not worrying about their investment in 90′s ERP systems.

That is only 8. 9 and 10 are the same as 1. The API of Windows Azure needs to be the same as on premise .NET development and portable between the platforms. Microsoft has all the bits and people, so there is no excuse that there are two different paradigms.
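To illustrate point 6, this is the sort of workaround developers resort to today (my sketch, not Simon’s code): a worker role that does nothing but sleep until the next scheduled run.

using System;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

// A crude "cron" substitute: a dedicated worker role that wakes up once a day.
public class NightlyJobRole : RoleEntryPoint
{
    public override void Run()
    {
        while (true)
        {
            // Simplified: always waits for 02:00 UTC tomorrow.
            DateTime nextRun = DateTime.UtcNow.Date.AddDays(1).AddHours(2);
            Thread.Sleep(nextRun - DateTime.UtcNow);
            RunNightlyJob();
        }
    }

    private void RunNightlyJob()
    {
        // Hypothetical job: purge old diagnostics data, send summary emails, etc.
    }
}

Note that this burns a full compute instance around the clock just to run one job, which is exactly the kind of VM waste point 7 complains about.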

For unanswered pleas, go to http://www.mygreatwindowsazureidea.com.

image I made a similar point in the opening paragraphs of my New Migration Paths to the Microsoft Cloud cover article for the June 2011 issue of Visual Studio Magazine, which went live on 6/1/2011:

Scott Guthrie, the former corporate vice president of the .NET Developer Platform, who's worked on ASP.NET since it was alpha code, will begin the toughest challenge of his career when he assumes control of the Azure Application Platform team, starting this month. The Windows Azure platform will face heightened scrutiny from Microsoft top management after a major reorganization of the company's Server and Tools Business (STB) group and the Developer Division, which is part of the STB. The re-org was announced internally in May [2011].

image Microsoft CEO Steve Ballmer -- along with some others on his leadership team -- appears to be disappointed in Windows Azure uptake by .NET developers during its first year of commercial availability. Gaining developer mindshare requires leveraging their current investment in .NET programming skills by minimizing application architecture and coding differences between traditional ASP.NET projects and Windows Azure Web Roles. Convincing developers and their IT managers to sign up for pay-per-use Windows Azure subscriptions necessitates proof of quick and easy migration of existing ASP.NET Web apps from on-premises or traditionally hosted servers to Microsoft datacenters. …


•• Stefan Reid and John R. Rymer (@johnrrymer) updated their The Forrester Wave™: Platform-As-A-Service For Vendor Strategy Professionals, Q2 2011 research report on 6/2/2011:

Identifying The Best Partner Choices For ISVs And Service Providers

image From the Executive Summary:

Platform-as-a-service (PaaS) offerings represent a critical space within the broader cloud ecosystem, as they provide the linkage between application platforms and underlying cloud infrastructures. In order to build a viable cloud market strategy, vendor strategists of independent software vendors (ISVs) and tech service providers need to understand their partnership opportunities in this area. In this report, we outline how Forrester's 149-criteria evaluation of 10 PaaS vendors can be used to determine the best partner choices.

Our research unveils that salesforce.com, because of its comprehensive PaaS features and strong vision and strategy, represents the top choice for ISVs. For service providers, we see strong performances of Cordys, LongJump, Microsoft, salesforce.com, and WorkXpress, which support multiple deployment and business models. Although PaaS overall is still in the early stages of its evolution with lots of potential risks for buyers, without a strong set of PaaS choices, vendor strategy professionals will struggle to make a safe infrastructure bet for their SaaS application platforms or local ecosystems. [Emphasis added.]

TABLE OF CONTENTS
  • PaaS: The Key To Unlock Cloud Computing's Power
  • PaaS: An Immature Market With A Fragmented Vendor Landscape
  • Platform-As-A-Service Evaluation Overview
  • PaaS: The ISV Partner Scenario
  • PaaS: The Service Provider Partner Scenario
  • Vendor Profiles
  • Supplemental Material
  • Related Research Documents
Features
  • XLS Forrester Wave™: Platform-As-A-Service, ISV Scenario, Q2 '11
  • Feature Forrester Wave™: Platform-As-A-Service, ISV Scenario, Q2 '11
  • XLS Forrester Wave™: Platform-As-A-Service, Service Provider Scenario, Q2 '11
  • Feature Forrester Wave™: Platform-As-A-Service, Service Provider Scenario, Q2 '11

The report sells for US$2,495.

I’m not sanguine about Forrester’s selection of salesforce.com as the “top choice for ISVs.” Windows Azure provides considerably more support for ISV developers than salesforce.com.


•• Louis Columbus (@LouisColumbus) analyzed an earlier Forrester report in his Sizing the Public Cloud Computing Market post of 6/1/2011:

image Forecasting that the global public cloud market will grow from $25.5B in 2011 to $159.3B in 2020 in the report Sizing the Cloud, Understanding And Quantifying the Future of Cloud Computing (April 2011), Forrester Research has taken on the ambitious task of forecasting each subsegment of its cloud taxonomy. Forrester defines the public cloud as IT resources that are delivered as services via the public Internet in a standardized, self-service and pay-per-use way. The aggregate results of its forecasts are shown in the attached graphic.

sizing the public cloud market

The forecast range is from 2008 to 2020 and I’ve included several of the highlights from the study below:

  • Forrester breaks out Business Process-as-a-Service (BPaaS) in their public cloud taxonomy, not aggregating this area of cloud computing into IaaS or PaaS. This is unique as other research firms have not broken out this component in their cloud market taxonomies, choosing to include Business Process Management (BPM) as part of either infrastructure-as-a-service (IaaS) or platform-as-a-service (PaaS) subsegments.  Forrester is predicting this category will grow from $800M in 2012 to $10.02B in 2020.
  • SaaS is quickly becoming a catalyst of PaaS and IaaS growth, growing from $33B in 2012 to $132.5B in 2020, representing 26% of the total packaged software market by 2016. Forrester is predicting that SaaS will also be the primary innovative force in public cloud adoption, creating applications that can be tailored at the user level.  Forrester is bullish on public cloud growth overall, and their optimistic outlook can be attributed to the assumption of cloud-based applications being configurable at the user level, with little to no enterprise-wide customization required.  
  • PaaS is forecasted to grow from $2.08B in 2012 to $11.91B in 2020. Forrester is defining PaaS as a complete preintegrated platform used for the development and operations of general purpose business applications.  The research firm sees the primary growth catalyst of PaaS being corporate application development beginning this year.  By the end of the forecast period, 2020, up to 15% of all corporate application development will be on this platform according to the report findings.
  • IaaS will experience rapid commoditization during the forecast period, declining after 2014.  Forrester reports that this is the second-largest public cloud subsegment today globally, valued at $2.9B, projected to grow to $5.85B by 2015.  After that point in the forecast, Forrester predicts consolidation and commoditization in the market, leading to a forecast of $4.7B in 2020.

imageI believe Forrester is conservative with its US$12 billion prediction for PaaS revenue in 2020, less than 10% of the SaaS market. Improved management services will make PaaS the preferred service for enterprise and many consumer services during the next nine years. Aggregating Business Process as a Service, such as CRM and ERP, with PaaS makes better sense to me.


The Windows Azure OS Updates team reported Windows Azure Guest OS 1.13 (Release 201104-01) on 6/3/2011:

The following table describes release 201104-01 of the Windows Azure Guest OS 1.13:

Friendly name: Windows Azure Guest OS 1.13 (Release 201104-01)

Configuration value: WA-GUEST-OS-1.13_201104-01

Release date: June 3, 2011

Features: Stability and security patch fixes applicable to Windows Azure OS.

Security Patches

This release includes the following security patches, as well as all of the security patches provided by previous releases of the Windows Azure Guest OS:

Bulletin ID | Parent KB | Vulnerability Description
MS11-018 | 2497640 | Cumulative Security Update for Internet Explorer
MS11-019 | 2511455 | Vulnerabilities in SMB Client Could Allow Remote Code Execution
MS11-020 | 2504829 | Vulnerability in SMB Server Could Allow Remote Code Execution
MS11-026 | 2503658 | Vulnerability in MHTML Could Allow Information Disclosure
MS11-027 | 2508272 | Cumulative Security Update for ActiveX Kill Bits
MS11-028 | 2484015 | Vulnerability in .NET Framework Could Allow Remote Code Execution
MS11-029 | 2489979 | Vulnerability in GDI+ Could Allow Remote Code Execution
MS11-030 | 2509953 | Vulnerability in DNS Resolution Could Allow Remote Code Execution
MS11-031 | 2510587 | Vulnerability in JScript and VBScript Scripting Engines Could Allow Remote Code Execution
MS11-032 | 2507818 | Vulnerability in the OpenType Compact Font Format (CFF) Driver Could Allow Remote Code Execution
MS11-034 | 2506223 | Vulnerabilities in Windows Kernel-Mode Drivers Could Allow Elevation of Privilege
Advisory | 2524375 | Fraudulent Digital Certificates could allow spoofing

Note: When a new release of the Windows Azure Guest OS is published, it can take several days for it to fully propagate across Windows Azure. If your service is configured for auto-upgrade, it will be upgraded sometime after the release date, and you’ll see the new guest OS version listed for your service. If you are upgrading your service manually, the new guest OS will be available for you to upgrade your service once the full roll-out of the guest OS to Windows Azure is complete.


David Linthicum (@DavidLinthicum) asserted “The industry's dirty little secret: Cost savings aren't the actual value of the cloud” in a deck for his The cloud won't save you money -- and that's OK in a 6/3/2011 article for InfoWorld’s Cloud Computing blog:

image When you think of cloud computing, you probably think of lower costs. However, a recent Forrester Research report concluded that companies now spend as much on new projects as on ongoing operations. I see that in the field as well, both in formerly mothballed projects that are brought back to life and in the many new projects that kick off each week.

image There are a few factors at work. First, we seem to be in a recovery. Second, we've avoided spending, so there is a huge application backlog. Finally, and perhaps the largest driver, is the movement to cloud computing and the mobile applications that leverage cloud computing.

What's strange is that we've promoted cloud computing as a means to reduce IT spending, yet it's causing the opposite, at least initially. However, instead of hardware and software costs, enterprises are buying cloud services and high-end consulting services, and they're hiring anybody out there who knows what AWS stands for (Amazon Web Services, by the way). The bubble is beginning to inflate, and the spending is rising sharply, thanks to the backlogs and the cloud.

What you need to watch out for is not the spending, but the value delivered. The dirty little secret in the world of cloud computing is that operational cost savings do not provide the value. Rather, the value comes from the operational agility of cloud-based applications: their elasticity and capacity to change quickly.

Although many cloud projects are sold as cost-reduction efforts, they often do not provide cost savings. You need to approach them instead for what they really are: strategic investments. Understand that the high initial spending at the front end will lead to a huge ROI and value in the back end -- at least if you're doing cloud computing the right way.


Matthew Weinberger (@MattNLM) reported AMD Survey: Cloud Is Exploding, Public Sector Faces Hurdles on 6/3/2011 for the TalkinCloud blog:

image Chip manufacturer AMD has released the results of a global survey designed to identify key trends in cloud computing adoption across both the public and private sectors. The good news for cloud providers: 70 percent of organizations surveyed are either already in the cloud or are planning a migration. And 60 percent of those who made the jump are already seeing business value. The bad news: Despite the hype, much of the public sector is still reluctant to jump in.

image I know this is going to be the first question asked in the comments, so let me just start by saying that the full AMD report has all the pertinent details on its surveying methodology and sample sizes.

Here’s the list of survey highlights, taken directly from AMD’s press release:

  • The State of Cloud Deployments: Seventy-four percent of U.S. organizations are using or investigating cloud solutions, followed by 68 percent in Asia and 58 percent in Europe.
  • Trust in the Cloud: Nearly 1 in 10 organizations in the U.S. estimate they store more than $10 million worth of data in the cloud. However, 63 percent of global respondents still view security as one of the greatest risks associated with the model.
  • Understanding and Preparation for the Cloud: For those currently using the cloud, 75 percent had the necessary IT skills to implement the solution versus only 39 percent of those who are currently investigating cloud today.
  • Cloud Clients: Cloud users are able to access their services primarily via a PC (90 percent), followed by smartphone (56 percent), tablet (37 percent) and thin client (32 percent).

Those numbers definitely justify a lot of optimism in the cloud services marketplace. Enterprises of all sizes are realizing the benefits the cloud brings, and those numbers are only going to rise as buzz continues to build.

That said, AMD’s findings indicate that the public sector isn’t so gung-ho about the cloud. While almost half of public respondents indicated that budget constraints were driving cloud adoption, it seems there’s a lack of IT expertise that’s still keeping many from taking the plunge — 43 percent of public sector respondents said they didn’t have the skills to perform a cloud deployment, compared with 23 percent in the private sector.

Of course, AMD couldn’t resist hyping its own cloud value-add, hinting that the weeks to come will bring insight into how the company is helping customers and partners efficiently and cost-effectively implement cloud solutions with updates to the AMD Opteron 4000 platform. With rival Intel making its own cloud chatter, it’s definitely going to be interesting to see how this plays out. Stay tuned for more updates.


<Return to section navigation list> 

Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

••• Joe Hummel posted The Future of Windows HPC Server? to his PluralSight blog on 6/3/2011:

imageI'm attending Cloud Futures on the Redmond campus this week --- curious what people are doing with Azure, and where the cloud is going.  This morning Mark Russinovich (of sysinternals fame, among many accomplishments) gave a keynote on "Windows Azure Internals".  Listening to the keynote, something clicked, and gave me some possible insight into something else I've been wondering about of late:  The future of Microsoft's Windows HPC Server product?  First some recent history...

As you probably know, in early May Microsoft went through a reorganization, and of interest to me was the fact that the Technical Computing group at Microsoft was dissolved, and the HPC team was moved under the Azure team.  So HPC was merging with Azure.

Listening to Mark's keynote, it dawned on me that we probably don't need a distinct HPC Server product anymore:  Azure contains a job scheduler, knows how to manage resources, and can allocate a set of nodes in the same rack of some datacenter to give you an on-demand cluster hosted entirely in the cloud.  And if Azure adds some InfiniBand support, GPUs, and fast local storage, this could be a very nice cluster indeed.  Need an on-premise cluster for security or data reasons?  Then you install Azure locally in the form of "Windows Azure Platform Appliance". [Emphasis added.]

So take this with a huge grain of salt (as an academic, my business prediction confidence level is near 0 :-), but it seems logical that Windows HPC Server will disappear as we know it, and rolled into Azure.  The good news is that the *concepts* of HPC --- exposing parallelism, data locality, OpenMP, TPL, PPL, MPI, SOA, Excel, Dryad, parametric sweep, etc. etc. --- will live on, unchanged.  So no matter what happens in the end, continue to think in parallel, and all will be well :-)

I’ll take “Then you install Azure locally in the form of ‘Windows Azure Platform Appliance’” with a box of salt until it RTMs and gets a SKU.


•• Brian Harris provides a brief (00:02:39) overview of the Dell Cloud Solution for Web Applications in this Webcast:

Dell’s new “SmartOS” supports PHP, Java and Ruby for Web apps but not .NET. Strange, when you consider all the press devoted about a year ago to Dell’s implementation of the Windows Azure Platform Appliance.

Dell’s vague Windows Azure Technology From Dell page mentions WAPA but gives no specifics about availability.


Dustin Amrhein (@damrhein) listed Four common obstacles when implementing a private cloud as an introduction to his Challenges Facing the Enterprise Private Cloud post of 6/3/2011 to The Cloud Computing Journal:

It seems like the last several months have brought a rapid increase in the number of organizations getting serious, really serious, about private clouds in their enterprise. By this, I mean they are going beyond working on ROI documents, formulating strategies, and doing referential research. They are starting to put their preparation to good use in the form of implementation work. Maybe it just so happens that many of the customers I work with are arriving at this phase at the same time, but I would wager that this is more or less happening in many companies.

Luckily, I have been able to be a part of the private cloud rollout for many of my customers, from the initial strategy and architecture talks to the implementation plans. One thing that has become clear is that regardless the industry or size of the company, there are technical challenges lying in the grass ready to attack the most well planned implementations. Having seen more than a few of these rollouts, I thought I would share four commonly encountered challenges, along with some insight into each:

1) Image management: In many private clouds, virtual images are the foundation on top of which users will build value. Effectively and efficiently managing the virtual images that you will need for your cloud is no small task. First, you need to decide on a centralized repository for your images that provides governance in the form of version control, fine-grained access, change history, and much more. Moreover, this repository should provide easy integration to the component(s) that will drive provisioning into your private cloud. Advanced image management solutions will go further by decoupling base binary images from the minor configuration changes needed by various users of that image. This will serve to keep the number of images in your repository to a minimum, thereby significantly reducing management overhead.

2) Service management: While many private cloud initiatives start small and tackle a specific problem in the enterprise, expanding its reach to other areas in the company is usually a medium to long-term goal. Once the private cloud expands beyond a relatively narrow scope, organizations must tackle the challenge of effectively managing the various services offered on the cloud. In other words, the need for a service catalog becomes quite pronounced. In adopting a service catalog, be on the lookout for a few key capabilities. Like with the image management solution, you want to have basic governance capabilities, but you also need to have a meaningful way to organize and surface services to end-users. You need to be able to impose degrees of service discoverability to users based on their organizational role. In addition, you should be able to define SLAs and associate them with services in your catalog.

3) Self-service access: This sounds simple. I mean, you want to put a web front-end on the services in your cloud and let users easily provision services, right? Well, that may be the end goal, but there are usually other considerations when we discuss self-service in an enterprise's private cloud. It goes beyond having proper access controls, since that is mere table stakes. Usually self-service access in the enterprise means properly integrating with ticket request solutions already in place. Enterprise users should definitely be able to easily request services, but that should not come at the expense of having a measure of control over what happens upon receipt of that request. I am not saying that every request for a cloud-based service requires human approval, but it is important that the organization be able to submit that request to basic rules that decide what should happen. This may result in automatic, immediate provisioning, or it may mean that an administrator needs to take a closer look.

4) Meaningful elasticity: One of the most eye-catching, intriguing aspects of cloud is probably the idea of elasticity. Users get exactly the right amount of resource at exactly the right time. Who wouldn't like that idea? The trick is that unmanaged elasticity may not be what every enterprise wants. Depending on the situation or the service, the enterprise may want to subject elasticity events to a rules engine that decides whether the system really should grow or shrink. To be clear, I do not mean a rules engine that looks at technical factors (CPU, memory, storage, etc.). I am talking about an engine that allows the enterprise to subject elasticity events to business rules. For example, companies should be able to dictate that scaling out a poorly performing back-office application should not adversely affect the performance of a revenue generating application.

These are just a sampling of some of the recurring challenges I run across these days. Personally, I think these challenges are what make working in this space exciting. Both providers and consumers are in the mode of forging a new road for IT. Will there be some bumps in the road? Yes. Can we solve them? Yes, and that is where the fun is!

Dustin joined IBM as a member of the development team for WebSphere Application Server.


Mike Maciag (@gmmaciag) called Private Clouds for Dev-Test: Solving the Struggle Between IT and Dev “The familiar struggle” in a 6/2/2011 post to the Cloud Computing Journal:

image It's an age-old struggle. IT works hard to provide top-of-the-line infrastructure, while developers juggle the build-test-deploy cycle. Somewhere in the middle things get lost in translation and the two find themselves at odds. Developers vie for control of resources and access to tools, while IT struggles to provide resources that are standardized and can be managed in a secure and consistent way.

Sound familiar? We see it all the time in the enterprises we work with. It really boils down to seemingly disparate goals. Even though they're working for the same organization, IT and developers are trying to achieve different things. IT strives for efficient use of resources to get a better return on infrastructure investment. Software developers, on the other hand, are concerned mainly with efficient development. They demand ready access to infrastructure and need a wide range of tools at their fingertips. The constantly fluctuating demands and hodgepodge of tools make it difficult for IT to keep up with their resource needs, and maintain security and consistency within the organization. It's no wonder they always seem to be butting heads.

A New Approach
With the advent of cloud computing, there's new hope of bridging the gap between IT and development. The processes that benefit most from moving to the cloud are those that are resource-intensive or "bursty" in compute demand - exactly the kinds of processes that abound in the build-test-deploy cycle. Great examples include compiling and building source code, testing on several different operating systems, and load testing.

image But is a basic cloud implementation enough to address the Dev - IT divide?

Most cloud implementations leverage virtualization and user self-service as their two cornerstone technologies. Virtualization dramatically improves the utilization of the underlying resource; now your underutilized physical servers can be loaded up with many virtual machines (VMs), improving your asset utilization. Virtualization also allows IT to provide standardized resources as templates – servers, applications, databases, etc. – to users, which enables fast setup and consistent management of the resources. Self-service gives users IT resources on demand: they can request a new server and voila – a new virtual machine is provided instantaneously. Because they don't have to wait for hours or days to get the compute services they need, productivity and time-to-market can be improved.

While cloud provides a lot of value, it still doesn't address the way development wants to interact with IT. Most development teams today have a software production process, a workflow that starts with developers writing software code, building and testing the software, and culminates with the release/deployment of the customer-ready software. This process, complex to begin with, is becoming even more complex with the adoption of Agile development methodologies that encourage faster and more iterative development of software applications. To improve the productivity and efficiency of development in this fast-evolving landscape, cloud infrastructure and services need to be tightly integrated to the process. The necessary ingredients of this integration include:

  • Automated, but seamless, self-service: Developers want self-service, but not in the typical Web interface sense that limits them to setting up one resource at a time. In today's fast-paced and Agile software production process, developers need the ability to set up resources instantaneously and in-context of the software production process. For example, developers want build systems to be automatically provisioned upon start of a build process, and torn down upon successful completion of the process. It is imperative that this process is seamless so it can be iterated multiple times a day; it's also important to automate the process so IT doesn't have to deal with VM sprawl or orphaned VM issues.
  • Customized resource and environment: While the cloud provides standard IT compute resources, developers typically want to customize the resources to the requirements of the software production process. This may involve configuring the standard IT-instantiated resources or deploying new dev/test-specific applications. Just as important, developers want these changes to be made automatically, without manual intervention.
  • Automatic resource management: Developers want the cloud solution to automatically manage the cloud resource, whether it means creation, deletion or active management of the cloud workload and resources. This lets them focus on what they do best and, more important, it enables the IT organization to manage cloud resources in an optimal and efficient manner and achieve shared service economies.
  • Visibility: Developers want their solution to provide them with end-to-end visibility into the software production process and the resources that these processes run on. Whether the process is running on physical, virtual or cloud resources, developers are looking for good analytics to quickly triage software production process errors (which build was broken, what software version passed the tests, etc.).
  • Flexibility: Finally, while developers want to leverage the cloud, they don't want to be locked in to any one resource choice. The development team wants to retain the option of using physical, virtual or cloud services (private or public) to best fit their production process.

Today's cloud solutions do a great job of managing the cloud resources from an infrastructure perspective:

  • Lab management solutions, such as VMware's Lab Manager or Citrix VMLogix, manage the VMs and standardized templates that are used by development teams
  • Cloud Infrastructure as a Service (IaaS) solutions such as Eucalyptus provision and manage the infrastructure that's used by the development team
  • Amazon EC2 and Rightscale provide capabilities to use the public cloud to build and test software applications.

Many of these solutions, however, don't understand the development processes and tasks that are run on cloud resources, and it is precisely the lack of these types of integrations that is preventing development teams from widespread adoption of cloud technology. One piece that fills the self-service gap to enable development on the private cloud is a software workflow automation system like Electric Cloud's ElectricCommander.

The Outcome
Let's talk about how this looks in practice. Say I need to do some system testing. I'm going to need a bunch of machines. If my IT organization supports virtualization technologies, such as VMware vSphere or Microsoft Hyper-V, I can get a number of virtual machines with a specific software configuration on them. This is a big improvement over the old days when I would need to secure physical machines, but it's still the virtual equivalent of a blank rack of servers. I've provisioned the resources, but I haven't provisioned the actual test applications. At this point, I have self-service compute resources, but I don't have a system in place that determines what needs to happen, what workflows it needs to go through, how I'm going to load the software or how I'm going to integrate the tools.

By implementing a software production automation system that is optimized for use on the cloud, I now have a platform that lets me define the steps, the workflow between them, tool integration and resource management.

This solution would provide workflow automation (automating, parallelizing and distributing steps within the workflow), seamless services (automatically setting up and tearing down resources as tasks demand), dev tool integration and end-to-end visibility and reporting (aggregating data from multiple apps to quickly identify software errors).
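
As a rough illustration of the "seamless services" idea described above, here is a minimal C# sketch of a build job that provisions a test VM when the job starts and guarantees teardown when it finishes. The ICloudLabClient interface and its methods are hypothetical placeholders for whatever provisioning API (lab manager, IaaS, or public cloud) the automation system is integrated with:

using System;

// Hypothetical provisioning client; the real call would target a lab manager,
// an IaaS API, or a public cloud, depending on where the resources live.
public interface ICloudLabClient
{
    string ProvisionVm(string template);   // returns a VM identifier
    void Teardown(string vmId);            // releases the VM so IT avoids VM sprawl
}

public static class BuildJob
{
    public static bool RunBuildAndTest(ICloudLabClient lab, Func<string, bool> buildAndTest)
    {
        // Provision the build/test environment only when the job actually starts...
        string vmId = lab.ProvisionVm("win2008-build-agent");
        try
        {
            return buildAndTest(vmId);
        }
        finally
        {
            // ...and always tear it down, even if the build fails,
            // so no orphaned VMs accumulate between iterations.
            lab.Teardown(vmId);
        }
    }
}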

How It Plays Out in the "Real World"
One of our customers, a large financial institution, has a development team of more than 5,000 developers spread around the world. They've long employed Agile practices, including continuous integration and test processes, but as the development team grew, its demands overwhelmed the script- and open source-based software build and test system they had relied on. Because individual teams were allowed the discretion to choose development methods and tools that worked best for them, the organization was dealing with a wide variety of tools that had become difficult to manage.

A private development cloud turned out to be an essential part of the solution for this organization. It allowed them to offer software build and test as a service to developers, while staying behind their firewall to maintain the tight security the financial industry demands. They now have a common pool of resources to support build, test and deploy procedures that are always accessible on-demand. Teams are still using the tools they prefer, but the organization can now easily allocate resources as they are needed, while supporting parallel builds across multiple computers with varied operating systems and languages.

This customer implemented the private development cloud as an opt-in service, letting teams choose whether to use it or continue to run builds and tests locally. But as teams began to see the benefits, they were eager to move to the cloud. IT is happier too: managing resources while accommodating development's varied tools is now easier, and they have much better visibility into the development process, which is invaluable for a financial company that has to be ready for audits.

At a macro level, implementing a private development cloud has allowed this organization to increase their productivity and save money. The developers, though, aren't thinking of it as an ROI - they're just glad to have a system that helps them do their jobs as efficiently and easily as possible.

Moving development to the cloud and enabling self-service allows developers and IT to work together more easily, with the end result they're all ultimately looking for: better software that's built, tested and deployed cheaper and faster. It seems the cloud holds the key to ending that age-old struggle once and for all.


Alan Le Marquand posted Announcing the Release of the ‘Microsoft Virtualization for the VMware Professional – VDI’ course to the Microsoft Virtual Academy on 6/2/2011:

The Microsoft Virtual Academy team would like to announce the release of the Microsoft Virtualization for the VMware Professional - VDI course.

Centralizing desktops and client computers is an increasingly important consideration for all IT departments as they begin to evaluate Virtual Desktop Infrastructure (VDI). Using VDI to consolidate maintenance activities reduces the time that end users must spend on OS and application deployment, configuration, patching and compliance, while decreasing hardware costs through virtualized resource pooling and sharing.

The VDI course provides a deep dive into VDI planning and solutions as the final section in the three-track program covering Microsoft Virtualization. Learn about:

  • When to use VDI.
  • Planning considerations.
  • Desktop models.
  • Windows 7 integration.
  • Application delivery.
  • User state virtualization.
  • Comparisons to other technologies.

Also in this course you will explore how Microsoft’s v-Alliance partnership with Citrix strengthens and broadens the VDI offerings.

Upon completing this course you will be able to understand, plan and deploy the appropriate VDI solution for your business and also gain 47 MVA points towards your next level.

Sign in and take this course today!


James Staten (@staten7) posted Jumpstart Your Private Cloud: Good Vendor Solutions Abound to his Forrester Research blog on 5/18/2011 (missed when posted):

image Forrester surveys show that enterprise infrastructure and operations (I&O) teams that are well down the virtualization path are shifting their priorities to deploying a private cloud. While you can certainly build your own, you don’t have to anymore. There’s an abundance of vendor solutions that can make this easier. In response to Forrester client requests for help in selecting the right vendor for their needs, we've published our first market overview of private cloud solutions. Through this research we found that there are a variety of offerings suited to different client needs, giving you a good landscape to choose from. There are essentially five solution types emerging: 1) enterprise systems management vendors; 2) OS/hypervisor vendors; 3) converged infrastructure solutions; 4) pure-play cloud solutions; and 5) grid-derived solutions. Each brings the core IaaS features as well as unique differentiating value.

image How should you choose which one is right for you? That very much depends on which vendors you already have relationships with, what type of cloud you want to deploy, where you want to start from, and what you hope to get out of the cloud once it's deployed.

A lot of enterprise I&O shops told us they want their private cloud to use as many of the same tools they already use in their virtualization and system management portfolio today. This makes sense if you want to link these environments together. Others find it easier and more applicable to be as similar to their internal HPC environment as possible.

Still others said they want to fully isolate their private cloud from the rest of their traditional deployments so they ensure they get cloud right and satisfy their developers that are currently going around IT to use public clouds. For them prioritizing developer engagement trumps legacy integration.

Whatever your bent, there’s a solution that’s likely right for you either on the market today or on its way to market. We included 15 vendors in our analysis, limiting the selection to solutions in market as of last quarter and with existing enterprise customer references they could provide. Another doubling of this number will be in market by late this year.

As was also noted in this research and prior reports, these are early days for private clouds – both for enterprises and vendors. So it’s best for I&O pros to view private cloud as a work in progress. One key takeaway must be considered when crafting your private cloud plan: Your virtualization environment is not your cloud. There’s a stark difference in that clouds are meant to be much more standardized, automated, and shared than traditional deployments and you must enable self-service and metering of use.

Our report attempts to lay out the landscape of private cloud solutions and provide a series of criteria I&O pros told us matter and thus can be used to determine which types of vendors to shortlist. Despite some media reports drawing their own conclusions from our research, the market overview does not rank the participating vendors. In fact, doing so using our analysis would be illogical, as some of the criteria used distinguish one type of solution from another. For example, we examine whether a solution includes or can include hardware, or is purely software. Whether hardware is included has no bearing on how good a solution is; it simply distinguishes a type of solution. If you plan to build your private cloud atop your own hardware, you wouldn’t want to consider one that requires its own.

Another core part of our research was to provide a means of apples to apples comparison of the solutions. To achieve this we asked all vendors to conduct a demo, going through a series of compulsory steps. These videos were recorded, and with the permission of the participants, will be made available exclusively to Forrester Leadership Board clients in the FLB community in the coming weeks. This is yet another way Forrester provides greater transparency to our research so you can make better decisions that make you successful every day.

We hope you find this research valuable and encourage your participation and discussion about private cloud solutions in our member community.


<Return to section navigation list> 

Cloud Security and Governance

David Tesar posted a description of the Windows Azure Security Overview course from the Microsoft Virtual Academy on 6/3/2011:

Learn the essentials of Windows Azure security by covering the protection included at every layer. We cover the security mechanisms included with Windows Azure at the physical, network, host, application, and data layers. Furthermore, get a basic understanding of some of the identity options you have for authenticating to Windows Azure.


David Strom (@dstrom) explained How to secure your VMs in the cloud, Part 2 in a 6/3/2011 post to the ReadWriteCloud:

image Our article earlier this week addressed some of the broad product categories and specific vendors that are in the market to provide VM protection for your cloud-based infrastructure. In this follow-up, we'll talk about some of the more important questions to ask your potential protection vendor as you consider these solutions.

  1. image What specific versions of hypervisors are protected? All of these products work with particular VMware hosts; some only work with more modern (v4 or newer) versions. Some, such as Catbird's vSecurity and BeyondTrust PowerBroker, also work with Xen hosts (and by extension, Amazon Web Services, which is built on top of Xen). None currently work with Microsoft Hyper-V technology.
  2. Do you need agents and if so, where are they installed? What happens when you add a new ESX host to your data center to get it protected by each product? Each product has a different process by which its protection gets activated; some (such as Hytrust and Reflex) are easier than others that require multiple configuration steps or a series of different agents to be added to each host. Some products install agents on the hypervisor itself, so no additional software is needed inside each VM running on that hypervisor. Others work with the VMware interfaces directly and don't need any additional software. Some require VMware's vMA or vShield add-ons. The goal here is to provide instant-on protection, because many times VMs can be paused and restarted, avoiding the traditional boot-up checks that physical security products use.
  3. Can I email reports to management and can they make actionable decisions from them? A security manager wants to understand where and how they are vulnerable, and be able to clearly explain these issues to management too. Some products produce reports that could be phone books if they were printed out: this level of detail is mind-numbing and not very useful or actionable. Others do a better job of presenting dashboards or summaries that even your manager can understand. I liked the reports from Trend: they were easiest to produce, parse, and share with management. Setting up reports for BeyondTrust was excruciatingly complex.
    [Trend Dashboard]: Trend Micro's Deep Security has a very actionable dashboard with alert summaries and event histories.
  4. How granular are its policy controls? Another item to examine is how easy it is to add elements to existing policies or create entirely new ones. This is the bread and butter of these products; but be aware of how they create and modify their policies because this is where you end up spending most of your time initially in setting things up.
  5. Finally, what is the price? Each product has a complex pricing scheme: some charge by VM, by virtual socket, by protected host, or by physical appliance. Make sure you understand what the anticipated bill will be with your current cloud formation and what you expect to be running in the future. For example, Catbird charges $2000 per VM instance, while Hytrust charges $1000 per protected ESX host.
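
To see why the pricing scheme matters as much as the list price, here is a quick back-of-the-envelope comparison using the figures quoted above and an assumed (hypothetical) fleet of 10 ESX hosts running 20 VMs each. The two products are not otherwise equivalent; the point is only how differently per-VM and per-host schemes scale with VM density:

using System;

class ProtectionCostSketch
{
    static void Main()
    {
        // Fleet size is an assumption for illustration; list prices are from the article.
        int hosts = 10;
        int vmsPerHost = 20;

        decimal catbirdPerVm = 2000m;   // $2,000 per VM instance
        decimal hytrustPerHost = 1000m; // $1,000 per protected ESX host

        decimal catbirdTotal = hosts * vmsPerHost * catbirdPerVm; // 200 VMs -> $400,000
        decimal hytrustTotal = hosts * hytrustPerHost;            // 10 hosts -> $10,000

        Console.WriteLine("Per-VM scheme:   {0:C0}", catbirdTotal);
        Console.WriteLine("Per-host scheme: {0:C0}", hytrustTotal);
    }
}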

image

No significant articles today.


<Return to section navigation list> 

Cloud Computing Events

The Windows Azure Team UK is sponsoring The Cloud Hack (#cloudhack) to be held at the Vibe Bar, The Truman Brewery, 91 Brick Lane, London, E1 6QL on 6/11/2011:

What APIs will be available?

With Huddle, you can manage projects, share files and collaborate with people inside and outside of your company, securely. It is available online, on mobile devices, on the desktop, via Microsoft Office applications, major business social networks and in multiple languages.

PayPal have agreed to join The Cloud Hack's API line-up, offering a variety of payment-based APIs for you to work with.

National Rail Enquiries is the definitive source of information for all passenger rail services on the National Rail network in England, Wales and Scotland.

Bing Maps Platform is a geospatial mapping platform produced by Microsoft. It allows developers to create applications that layer location-relevant data on top of licensed map imagery. With Bing's powerful APIs, developers can create location based applications and services with ease.
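
For anyone sizing up the Bing Maps option before the event, a geocoding call is about as small as a "hello world" gets. The sketch below assumes the Bing Maps REST Locations service and a key obtained from the Bing Maps portal; the exact URL parameters are from memory and should be checked against the official documentation:

using System;
using System.Net;

class BingMapsGeocodeSketch
{
    static void Main()
    {
        // Assumed endpoint shape for the Bing Maps REST Locations API;
        // replace YOUR_BING_MAPS_KEY with a key from the Bing Maps portal.
        var url = "http://dev.virtualearth.net/REST/v1/Locations"
                + "?query=" + Uri.EscapeDataString("91 Brick Lane, London E1 6QL")
                + "&key=YOUR_BING_MAPS_KEY";

        using (var client = new WebClient())
        {
            string json = client.DownloadString(url);   // JSON response with candidate locations
            Console.WriteLine(json);
        }
    }
}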

What’s happening?

The day will be a social affair with food, drink, challenges and prizes to be had. You will be hacking together apps and widgets using Windows Azure, so make sure you are prepared! We’ll send a welcome pack two weeks before the event, which will include your unique Windows Azure login, details of participating APIs and a preview of the challenges you will be facing. Be sure to get yourself familiar with everything prior to the event to give you the very best chance….

What should I bring?

Bring your weapon of choice - be it a laptop, tablet, smartphone, or anything else that can code. Chances are there will be people with extras, but make sure you have something to start with.

How do I tell people what I've done?

Blog, tweet, post, comment. Tell the world! We’ll be using the #cloudhack hashtag on the day.

What can I expect?

You will have the opportunity to access some exclusive APIs from some leading brands as well as getting down and dirty with Microsoft’s new cloud platform. With a chance to win up to £1,000+ on the day as well as other prizes it will surely be worth your while, and if you don’t win, there’s always the free bar.

11.06.11 Vibe Bar

The Truman Brewery 91 Brick Lane, London, E1 6QL


Roger Struckhoff (@struckhoff) announced “Microsoft's Brian Prince Outlines Strategy at Cloud Expo” in his Microsoft Leverages Cloud Storage for Massive Scale post of 6/3/2011 to the Cloud Computing Journal:

Three of the most important words when it comes to enterprise IT & Cloud Computing are storage, storage, and storage.

Microsoft Architect Evangelist Brian Prince will tackle this subject during his Cloud Expo session, entitled "Leveraging Cloud Storage for Massive Scale."

"Many organizations are a little shy when it comes to adopting the cloud, for several reasons,," Brian says. "Most are slowly adopting the cloud by using a hybrid architecture strategy. Couple this strategy with the need to increase scale and speed on your web application, without taking the big cloud jump."

image Brian says his session "will look at how to achieve this with a low cost and low risk strategy of keeping your app where it is, but leveraging cloud storage and caching to achieve great scale and performance." He alleges that he gets "super excited whenever I talk about technology, especially cloud computing, patterns, and practices. That's a good thing, given that my job is to help customers strategically leverage Microsoft technologies and take their architecture to new heights."

Brian is also co-founder of the non-profit organization CodeMash, runs the global Windows Azure Boot Camp program, and is co-author of the book Azure in Action, published by Manning Press.


<Return to section navigation list> 

Other Cloud Computing Platforms and Services

••• Tim Negris (@TimNegris) warned “Don't Believe the Hype” and then asked if Hadoop is the Answer! What is the Question? in a 6/4/2011 post to the Cloud Computing Journal:

Disclosure: In addition to being a Sys-Con contributor, I am the VP of Marketing at 1010data, a provider of a cloud-based Big Data analytics platform that provides direct, interactive analytical access to large amounts of raw structured and semi-structured data for quantitative analysts.

So, 1010data doesn't have much actual overlap with Hadoop, which provides programmatic batch job access to linked content files for text, log and social graph data analytics.  I must confess, I have not been paying much attention to Hadoop.

But, while doing research for my upcoming presentation on Cloud-based Big Data Analytics at Cloud Expo in NYC (3:00 on Thursday), I uncovered an apocrypha in the making, a rich mythology about a yellow elephant whose name seems to have become the answer to every question about Big Data.  Got a boatload of data?  Store it in Hadoop.  Want to search and analyze that data?  Do it with Hadoop.  Want to invest in a technology company?  If it works with Hadoop, get out the checkbook and get in line.

And then, I was on a Big Data panel at the Cowen 39th Annual Technology, Media and Telecom Conference this week and several of my fellow panelists were from companies that in one way or another had something to do with Hadoop.

So, as a public service to prospective message victims of the Hadoop hype, I decided to try to figure out what Hadoop really is and what it is really good for.  No technology gets so popular so quickly unless it is good for something, and Hadoop is no exception.  But Hadoop is not the solution to every Big Data problem.  Nothing is.  Hadoop is a low-level technology that must be programmed to be useful for anything.

It is a relatively immature (V0.20.x) Apache open source project that has spawned a number of related projects and a growing number of applications and systems built on top of the crowd-sourced Hadoop code.  I have discovered that many people say "Hadoop" when they really mean Hadoop plus things that run on or with it.  For instance, "Hadoop is an analytical database" means Hadoop plus Hive plus Pig.  The ever-lengthening "Powered By" list is here.

Despite their general enthusiasm for the framework, though, many Hadoop developers also stress the difficulty of programming applications for it, including Rick Wesel, the developer of the Cascading MapReduce library and API, who writes on his blog,

The one thing Hadoop does not help with is providing a simple means to develop real world applications. Hadoop works in terms of MapReduce jobs. But real work consists of many, if not dozens, of MapReduce jobs chained together, working in parallel and serially.

MapReduce is a patented software framework developed by Google and underlying Hadoop.  Its Wikipedia entry describes the two parts like this:

"Map" step: The master node takes the input, partitions it up into smaller sub-problems, and distributes those to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes that smaller problem, and passes the answer back to its master node.

"Reduce" step: The master node then takes the answers to all the sub-problems and combines them in some way to get the output - the answer to the problem it was originally trying to solve.

So what is Hadoop?  Straight from the elephant's mouth,

Apache Hadoop is a framework for running applications on large cluster built of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework.

Said more simply, Hadoop lets you chop up large amounts of data and processing so as to spread it out over a dedicated cluster of commodity server machines, providing high scalability, fault tolerance and efficiency in processing operations on large quantities of unstructured data (text and web content) and semi-structured data (log records, social graphs, etc.)  In as much as a computer exists to process data, Hadoop in effect turns lots of cheap little computers into one big computer that is especially good for analyzing indexed text.

By far Hadoop's most generally interesting and newsworthy triumph to date has been helping IBM's Watson supercomputer beat the best humans on Jeopardy.  That role is dissected here.

Aside from winning game shows, though, what is Hadoop good for? Speaking of the Big Data biggie, IBM, here is Big Blue's answer to that question by way of a pithy Judith Hurwitz tweet:

But, Hadoop is early - not yet at Version 1! - open source code created and edited by many different pro bono programmers, without a commercial binding of business process, coding disciplines, or direct market dynamics.  In other words, it is what it is and some of the functions that are hard or tedious to code, even if nonetheless badly needed, go wanting.  (Read tales of "zombie tasks" and other terrors from the "Dark Side of Hadoop" here.)

In any case, though, Hadoop is very versatile and many smart people and companies have found an amazing variety of uses to put it to.  And it is always fun to watch the tech world wind itself up around a new topic.  Big Data is the new black and Hadoop is the "it" elephant.

But it isn't good for everything.  See http://wiki.apache.org/hadoop/HadoopIsNot or read Ricky Ho's excellent blog post, which shows how Hadoop's design makes it a poor choice for things like fast, interactive, ad hoc analysis of large amounts of frequently updated structured (transactional) data, as for, say, all the daily trades in a busy stock exchange or large retail chain.

As Ho explains it, Hadoop spreads data out in file chunks on a number of computers and it breaks programming down into many small tasks, also spread across those machines and run in parallel as a batch job.  While a job is running, the data it is working on cannot be updated and, because the processes must communicate with each other and they are spread out across multiple networked computers, there is considerable network-related latency in the execution of the job.

Hadoop grew out of work done by both Yahoo and Google, which betrays its essential purpose: gathering, storing and indexing vast numbers of chunks of text and semi-structured data, understanding the relationships between those chunks, and finding them quickly when needed.  So, it is not surprising that the most impressive uses of Hadoop we have seen are in the area of analyzing so-called "social data".

That's the voluminous accumulation of comments, web pages, and tweets, the identities, locations, relationships and other attributes associated with the people, sites, things and processes referenced in that data.  There is much to be learned from such data.  But when there is a lot of it, just putting it somewhere and searching and analyzing it efficiently across multiple computers and disks is difficult and Hadoop and many of its best applications are built for making that easier.

But, there are numerous products now layered on top of Hadoop that make it function as a tabular relational database and other forms of storage.  This enables customers to reuse SQL code they have already developed and to develop new query code in a language they know.  And it enables Hadoop to go pilot fish or Trojan horse on Oracle and MySQL.  But, using SQL as an access language and materializing data in unordered, joined, indexed tables does not play to Hadoop's natural strengths.

Hive is a Hadoop project that relationalizes Hadoop for data warehousing and analytics and here is what one apparently experienced crowdsourcer said about it on the Stack Overflow site.

Hive is based on Hadoop which is a batch processing system. Accordingly, this system does not and cannot promise low latencies on queries. The paradigm here is strictly of submitting jobs and being notified when the jobs are completed as opposed to real time queries. As a result it should not be compared with systems like Oracle where analysis is done on a significantly smaller amount of data but the analysis proceeds much more iteratively with the response times between iterations being less than a few minutes. For Hive queries response times for even the smallest jobs can be of the order of 5-10 minutes and for larger jobs this may even run into hours.

Cutting through the Hadoop hype: if you are looking to query, report on or analyze large amounts of unstructured or semi-structured data, or you need to build a scalable SQL data warehouse, and in either case you don't mind latency and batch processing is an acceptable model for your situation, then Hadoop and many of its adjuncts may solve your problem.

But, if you need to interact directly with large amounts of raw tabular data and many kinds of semi-structured data for iterative, collaborative analytics, Hadoop is not what you are looking for.  If you want to do it in a managed cloud, you could look at 1010data, on dedicated hardware, check Teradata or Netezza, with software only on commodity hardware, Greenplum and Vertica might be worth a look.


••• Simon Munro (@simonmunro) asked Is Amazon RDS with Oracle any good? in a 6/2/2011 post to Cloud Comments.net:

image We know that Oracle databases power enterprises. An RDBMS market share of 41% means that a lot of data is getting stored in and processed by Oracle databases. I have also seen my fair share of Oracle-biased data centres to know that these databases run on really big tin. Multiprocessor, RAM-loaded beasts with sophisticated storage and fancy networking to back up data across data centres on the fly are the nature of real-world Oracle installations. And these installations support massive ERP systems upon which enterprises stand and which are at the core of finance and trading.

So what then of Amazon Oracle RDS? Does this mean that all of the great things that the enterprise gets out of Oracle can be provisioned on demand via a simple web console? Can we get the kind of performance that enterprises are familiar with out of an RDS ‘High-Memory Quadruple Extra Large DB Instance’? A 64GB, 8 core, ‘High I/O Capacity’ machine should be able to do a lot of work.

I’m not so sure.

I have some hands on Oracle experience, but only from a developer perspective. I never did break into the Oracle DBA cabal society and they never taught me their secret handshake. But one thing I do know is that those hardcore Oracle DBAs know some serious shit. They drill down to a really low level, look at mountains of monitoring data, talk to each other with secret acronyms and are able to configure, build, tune and beat into submission their Oracle databases to get the most out of them.

I would love to know what they think about Oracle RDS. Does it give them the power to use Oracle properly? Does the database engine abstract things like IO tuning away enough for it not to matter? Or is the underlying Oracle RDS configuration a ‘good enough’ configuration that works in a lot of cases but is suboptimal when the going gets tough.

The Oracle feature set on RDS is incomplete, but I’m not qualified to say if that is such a big deal or not. Although the lack of RAC is a clue that it’s not for really big databases.

  • Oracle database features not currently supported include:
    • Real Application Clusters (RAC)
    • Data Guard / Active Data Guard
    • Oracle Enterprise Manager
    • Automated Storage Management
    • Data Pump
    • Streams

It also seems that there are some security restrictions – which is expected on multi-tenant infrastructure. But are the restrictions too much for real Oracle DBAs?

Again with AWS we are stuck with a lack of detailed documentation so it will be difficult to understand what the limitations are. If I had to ask a serious Oracle DBA to help me with a database where he or she was given no information about the underlying network and storage architecture, I’m sure I wouldn’t get that much help. So without detailed documentation it is left up to a first mover to put the effort in to see how well it stands up, and I’m glad it won’t be me.

With MySQL I can understand. MySQL was always the ginger child of the database world and has never been taken seriously in high load database environments. So it is easy to use RDS MySQL, it can’t be that bad.

We’ll have to wait and see what happens with Oracle. I’d be very interested to see what the hardcore DBAs think of Oracle on RDS. Not the evil plans and bring your own licensing issues, but the real world performance and ability to process and store data.

Until then, I think most architects will hold off recommending RDS Oracle as a core part of any solution. For now it will simply be safer to keep your big Oracle databases on premise. At the very least it will keep your DBA happy and, let's be honest, you never want to piss off your Oracle DBAs.

I highly recommend articles by the Cloud Comments.net blog’s authors, Simon Munro, Grace Mollison and James Saull, because of their technical orientation. Subscribed.


••• Simon Munro (@simonmunro) asserted Dryad Cannot Compete With Hadoop in a 12/21/2010 post to the Cloud Comments.net blog:

Dryad, Microsoft’s MapReduce implementation, has finally found its way out of Microsoft Research and is now open to a public beta. Surprisingly it is a quiet blog announcement, with no accompanying name change typical of Microsoft, such as ‘Windows Server 2008 R2 High Performance Distributed Compute Services’ – or something equally catchy. Unsurprisingly for an enterprise software vendor, Dryad is limited to Windows HPC (High Performance Compute) Server – which means high-end tin and expensive licences. While most of the MapReduce secret sauce for the Dryad implementation probably comes from the HPC OS, it is disappointing that there is still no generally available MapReduce on the Microsoft stack on the horizon.

I recall more than a year ago wishing for Dryad (and DryadLINQ) on Azure to query Azure Table Storage (the Azure NoSQL data store) and generally thinking that Azure would get some MapReduce functionality as a service (like Amazon Elastic MapReduce – a Hadoop implementation on AWS) out of the Dryad project. But it seems that Microsoft is focussing on the enterprise MapReduce market for now.
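
For context, DryadLINQ's appeal was that a distributed job could be expressed as an ordinary LINQ query. The snippet below is plain LINQ-to-Objects, not the actual DryadLINQ API; it only illustrates the kind of declarative query that DryadLINQ would partition and execute across a cluster, with the in-memory array standing in for a partitioned, cluster-resident data set:

using System;
using System.Linq;

class LinqStyleQuerySketch
{
    static void Main()
    {
        // In DryadLINQ the source would be a distributed data set;
        // here it is just an in-memory array so the query shape is the focus.
        var logLines = new[]
        {
            "2010-12-21 ERROR disk full",
            "2010-12-21 INFO  job started",
            "2010-12-21 ERROR timeout"
        };

        var errorsByDay = from line in logLines
                          where line.Contains("ERROR")
                          group line by line.Substring(0, 10) into g
                          select new { Day = g.Key, Errors = g.Count() };

        foreach (var row in errorsByDay)
            Console.WriteLine("{0}: {1} errors", row.Day, row.Errors);
    }
}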

I’m not that sure about the market for enterprise MapReduce and defer to the experts, wherever they may be. I thought that MapReduce was about using cheap resources (commodity hardware and open source licences) in order to scale out compute power. Surely if you have to pay the Microsoft and high-end hardware tax, the case for scale-out drops and you are better off just scaling up already expensive servers? I am sure the research market will still go for Hadoop or similar, and maybe Microsoft sees something in the deep pockets of financial services.

Had Microsoft brought Dryad to Azure, as ‘MapReduce as a Service’ then there would be something worth looking into on the Microsoft stack, but until then MapReduce for the masses is likely to remain on Hadoop. Hadoop is the de facto MapReduce implementation for non-specialists and as MapReduce gains traction as a way of solving problems, Microsoft will find it impossible to catch up.

I missed this post because I was following http://simonmunro.com/, not http://cloudcomments.net/. That’s fixed here.


•• Raghav Sharma continued his comparison series with Amazon AWS vs. Microsoft Azure Part 2 in a 6/4/2011 post to the CloudTimes blog:

In part 1 of this article (http://cloudtimes.org/amazon-aws-vs-microsoft-azure-part-1/) we initiated a comparison of the services provided by Amazon AWS and Microsoft Azure. We talked about a bit of the history of the two platforms and found certain pros and cons of each of them.

Further in this part 2 of the series, we’d focus on certain tangible and rather objectively measurable comparison points of the two services. As we discussed already, the famous “free tier” was one of the major contributors to the popularity of AWS. As part of their innovation promotion package, AWS still runs that free tier. Similarly, Microsoft has also created a free tier for their services. Let’s take a look at both providers’ free tiers.

AWS (quoted from AWS site → http://aws.amazon.com/free/)

  • 750 hours of Amazon EC2 Linux Micro Instance usage (613 MB of memory and 32-bit and 64-bit platform support) – enough hours to run continuously each month*
  • 750 hours of an Elastic Load Balancer plus 15 GB data processing*
  • 10 GB of Amazon Elastic Block Storage, plus 1 million I/Os, 1 GB of snapshot storage, 10,000 snapshot Get Requests and 1,000 snapshot Put Requests*
  • 5 GB of Amazon S3 standard storage, 20,000 Get Requests, and 2,000 Put Requests*
  • 30 GB of internet data transfer (15 GB of data transfer “in” and 15 GB of data transfer “out” across all services except Amazon CloudFront)*
  • 25 Amazon SimpleDB Machine Hours and 1 GB of Storage**
  • 100,000 Requests of Amazon Simple Queue Service**
  • 100,000 Requests, 100,000 HTTP notifications and 1,000 email notifications for Amazon Simple Notification Service**
  • 10 Amazon Cloudwatch metrics, 10 alarms, and 1,000,000 API requests**

Azure (quoted from Azure official site → http://www.microsoft.com/windowsazure/free-trial/)

  • 750 hours of an Extra Small Compute Instance, which caps out at 1.0 GHz and 768MB of memory and 25 hours of a Small Compute Instance.
  • 20GB of cloud storage with 50k Storage transactions.
  • Data transfers: 20GB in / 20GB out.
  • 90 days of access to the 1G Web Edition SQL Azure relational database.

[Putting] the jargon aside, let’s put these two offerings in perspective -

  • 750 hrs a month easily surpasses 24×7 needs (24×31 = 744 hrs a month), which basically says that you can have your application up and running on the cloud for free. Both Azure and AWS allow that.
    • One contrasting point here though: AWS provides a Linux instance and NO Windows instance, while Azure provides a Windows instance. So, if your application is on Windows, hard luck with AWS, but Azure can still help you.
  • The traffic for the application is the next most important thing. AWS allows 15GB two way (in and out, put together 30GB), whereas Azure takes it one step further and provides 20GB two way (in and out, put together 40GB). AWS provides a value add in the form of an Elastic Load Balancer, which Azure doesn’t really mention.
  • The storage, another very important parameter for an application, is hosted on AWS’s famous EBS (Elastic Block Storage) and is capped at 10GB. In addition, 5 GB of S3 storage is bundled in the offer. Azure provides 20GB of storage (since they have mentioned it explicitly, I assume that it includes all three types: table, blob and queue).
    • One feather here for Azure: they have an RDBMS on the cloud, SQL Azure. They bundle 90 days’ worth of usage of SQL Azure Web Edition capped at 1GB. Enough for a developer to get his idea machine working.

All in all, a decent launch-pad package from both AWS and Azure. There is one distinction though: the AWS free tier runs for one year from the sign-up date, so it basically allows a business application to grow on the system free of cost for its first year (within free tier limits) and then starts charging on a pay-as-you-go model. The Azure free tier, however, doesn’t do that, and is limited to September 2011 only. From that perspective, AWS moves ahead quite clearly.

The next ideal point of comparison would be the instance sizes and the prices attached to them. In our study, we found that the instance sizes are generally segmented pretty closely, and there is not really a huge difference. The terms used by each of the providers might differ; however, the configurations are comparable.

The table attached here provides a quick insight into the various instances offered by each of them, and the pricing. Since Azure only provides Windows instances, we have taken Windows pricing for AWS as well. On average, AWS Linux instances are priced 25%-30% lower than the corresponding Windows instances.

Besides the standard instances, AWS also provides certain customized instances. There are three categories: High Memory, High CPU and Clustered. All three are identifiable by their names as such. This kind of offering is mostly seen as an indicator of the maturity of the provider. Azure doesn’t provide any such service at the moment.

AWS & Azure Compute Instance Comparison
| Provider | Compute Instance Size | CPU | Memory | Instance Storage | I/O Performance | 32/64 bit | Remarks | Hourly rate |
|----------|-----------------------|-----|--------|------------------|-----------------|-----------|---------|-------------|
| Azure | Extra Small | 1 GHz | 768 MB | 20 GB* | Low | | | $0.05 |
| Azure | Small | 1.6 GHz | 1.75 GB | 225 GB | Moderate | | | $0.12 |
| Azure | Medium | 2 x 1.6 GHz | 3.5 GB | 490 GB | High | | | $0.24 |
| Azure | Large | 4 x 1.6 GHz | 7 GB | 1,000 GB | High | | | $0.48 |
| Azure | Extra large | 8 x 1.6 GHz | 14 GB | 2,040 GB | High | | | $0.96 |
| AWS | Micro | 2 x 1.1 GHz | 613 MB | EBS only | Low | 32/64 bit | Shared instance | $0.03 |
| AWS | Small | 1 x 1.1 GHz | 1.7 GB | 160 GB | Moderate | 32 bit | | $0.12 |
| AWS | Large | 4 x 1.1 GHz | 7.5 GB | 850 GB | High | 64 bit | | $0.48 |
| AWS | Extra large | 8 x 1.1 GHz | 15 GB | 1,690 GB | High | 64 bit | | $0.96 |
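
To put the table in concrete terms, here is a quick worked example of the monthly compute bill for a single always-on small instance once the free tiers no longer apply, using the hourly rates above (storage, bandwidth and database charges are ignored here):

using System;

class MonthlyComputeCostSketch
{
    static void Main()
    {
        int hoursPerMonth = 24 * 31;        // 744 hours in a 31-day month
        decimal smallInstanceRate = 0.12m;  // $/hour for an Azure Small or AWS Small (Windows) instance

        decimal monthlyCost = hoursPerMonth * smallInstanceRate;  // ~$89.28

        Console.WriteLine("Always-on small instance: {0} hours x {1:C2}/hr = {2:C2} per month",
                          hoursPerMonth, smallInstanceRate, monthlyCost);
    }
}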

Note: A load balancer isn’t applicable to a single cloud computing instance. Raghav also fails to distinguish Azure, a Platform as a Service (PaaS) offering, from AWS, an Infrastructure as a Service (IaaS) offering.


•• Alex Popescu (@al3xandru) reported MongoHQ Announces MongoDB Replica Set Support in a 6/3/2011 post to his myNoSQL blog:

image MongoHQ Announces MongoDB Replica Set Support:

MongoHQ, a MongoDB hosting solution:

Now, we are excited to announce that we offer high-availability multi-node replica set plans on our MongoHQ platform. We are pretty excited about this release, as it makes available some great features that our users have been asking for, including:

  • High availability databases with automatic failover
  • Nodes located in multiple availability zones
  • Dedicated volumes to maximize read/write performance
  • Slave nodes that can be used as read-slaves for enhanced read throughput
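
A minimal sketch of how a client might use such a replica set follows, assuming the 1.x-era MongoDB C# driver; the host names, replica set name and database are placeholders (MongoHQ supplies the real values), and the slaveOk connection-string option simply allows reads to be served from the read-slaves mentioned above:

using System;
using MongoDB.Bson;
using MongoDB.Driver;

class ReplicaSetSketch
{
    static void Main()
    {
        // Placeholder hosts and replica set name; slaveOk=true lets the driver
        // route reads to secondary (read-slave) nodes.
        var connectionString =
            "mongodb://node1.example.com:27017,node2.example.com:27017/?replicaSet=rs0&slaveOk=true";

        var server = MongoServer.Create(connectionString);
        var db = server.GetDatabase("analytics");
        var events = db.GetCollection("events");

        // Writes always go to the primary; if the primary fails, the driver
        // reconnects to the newly elected primary automatically.
        events.Insert(new BsonDocument { { "type", "pageview" }, { "ts", DateTime.UtcNow } });

        Console.WriteLine("Documents in collection: {0}", events.Count());
    }
}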

If scaling MongoDB with replica sets and auto-shar[d]ing is so easy, I’m wondering why it took MongoHQ so long to add support only for replica sets. [Emphasis added.]

MongoHQ offers shared MongoDB databases running on Amazon EC2 instances for prices ranging from Free (16 MB) to Large: US$49/month (30 GB). High-availability Replica Sets offer 20 GB storage for $300/month; storage up to 100 GB costs $1.50/GB-month. I’m wondering why folks are clamoring for larger SQL Azure databases when a shared MongoDB service tops out at 100 GB max storage. If the 100 GB limit applies to replicas and primaries have two secondary replicas (as does SQL Azure), 100 GB storage = 33 GB of stored data and indexes.


•• Jnan Dash (@JnanDash) posted The Gang of Four (sans Microsoft) on 6/1/2011:

image Yesterday at the Wall Street Journal’s “All Things Digital (D9) conference” in Rancho Palos Verdes, Eric Schmidt, the ex-CEO of Google, was the first to be interviewed by Walt Mossberg and Kara Swisher. He made some interesting points which made it clear that Facebook is Google’s number 1 competition now. He admitted that he made a mistake by not taking the Facebook threat seriously four years ago.

He talked about the “Gang of Four” meaning – Google, Facebook, Amazon, and Apple (hey, no mention of Microsoft). These four have common characteristics in that they are all exploring platform strategies, and they all focus on a consumer brand with aggressive scaling and globalization as key themes. The unique part of Facebook is their hold over the consumer “identity” by connecting to friends and relatives.

Eric acknowledged that he and other executives failed to take Facebook seriously four years ago when the social networking site had around 20m active users. Today, with more than 500 million users and growing, Facebook has become a magnet for online advertising, and continues to stunt Google’s financial growth.

Mr Schmidt said that Google, with co-founder Larry Page now at the helm, is pushing to develop more ways to connect people with their friends and family. “I think the industry as a whole would benefit from an alternative [to Facebook],” Mr Schmidt said.

He added that attempts by Google to negotiate a partnership with Facebook were repeatedly turned down, with the networking site preferring to partner up with rival Microsoft, which owns a 1.6pc stake in the company. Google also has ties to Facebook. One of its former executives, Sheryl Sandberg, is Facebook’s chief operating officer.

Facebook poses another problem to Google, as much of the information on Facebook’s website cannot be indexed by Google’s search engine. This restriction threatens to make Google less useful as more people form social circles online which could make it more difficult for it to understand a user’s personal preferences, which benefits advertisers.

Apple’s platform tends to be more proprietary, but it has built a huge franchise of developers for its iPad and iPhone applications. Google’s Android is much more open and is rapidly building a huge developer community for tablet applications. Amazon pioneered the cloud computing infrastructure and hence provides the elaborate AWS (Amazon Web Services) platform for [i]ts infrastructure.

The lack of mention of Microsoft in the Gang of Four is interesting, as it lags in consumer internet branding and seems to be moving more towards enterprise computing. I am sure the leaders at Microsoft would disagree with this characterization.

Windows 8 appears to me to be targeted at consumers, rather than enterprises. The features disclosed so far offer little incentive for enterprises to upgrade to Windows 8 from Windows 7, or Vista.


Klint Finley (@Klintron) reported How Linux 3.0 Makes Virtualization Easier in a 6/3/2011 post to the ReadWriteCloud:

image We told you earlier this week about the Linux 3.0 release candidate, but here's another point of interest: as of the new version, as stated by Wim Coekaerts, "every single bit of support needed in Linux to work perfectly well with Xen is -in- the mainline kernel tree."

Up until now, a few patches have been required to the Linux kernel to make Xen hypervisors and virtual machines work. Now those features are built right in. Plus, there's a new kernel mode for dealing with virtualization.

Coekaerts explained in more detail in a blog post:

Xen has always used Linux as the management OS (Dom0) on top of the hypervisor itself, to do the device management and control of the virtual machines running on top of Xen. And for many years, next to the hypervisor, there was a substantial linux kernel patch that had to be applied on top of a linux kernel to transform into this "Dom0". This code had to constantly be kept in sync with the progress Linux itself was making and as such caused a substantial amount of extra work that had to be done.

image The Xen team has been working for years to get everything needed to run the Linux kernel as Dom0. Some of the components were added last year, but the final elements have now been added to handle everything.

Another addition to the kernel that will make the mainline Linux kernel more Xen friendly is pvops, a mode which will enable the kernel to switch between paravirtualization (pv), hardware virtualization (hvm) or paravirtual-hardware virtualization (pv-hvm).

One thing this means, according to a post by Ewan Mellor, is that Linux distributions will no longer have to package different kernels for Xen support, making Xen support much easier.

It would be interesting to know how these changes affect Linux 3.0 support by Hyper-V.


<Return to section navigation list> 
