This article presents the view that a Service Oriented App (SO App) is a system of collaborating microservices: mainly rather small, finely grained services internal to the system. This realization is central to the current interest in microservices. These internal services may also connect with external services as necessary.
As a result, an SO App is not a monolithic structure, as is often the case when a pure object oriented design approach is used to define software structure. Rather, the structure of an SO App is that of several not-so-large services collaborating with the UI, with data storage, with each other, and with external services to provide the required behavior. This makes SO Apps more readily scalable and more loosely coupled than traditional monolithic object oriented apps. Discrete services – at times separately hosted – greatly facilitate loose coupling and scalability.
Stepping back, one can view the current interest in microservices partly as the result of more and more people applying practices we have known for years make very big differences in the cost, maintainability, scalability, and testability of software systems: the best practices of separation of concerns, abstraction, tight cohesion, loose coupling, and encapsulation.
Seeing code that implements abstract concepts like these best practices often aids in understanding both the concepts and ways of applying them. To that end, a code example is presented to aid in understanding the idea of an SO App as a “system of collaborating microservices”. You can run the code example by following the Setup Instructions in the Visual Studio solution directory. Also in this directory is a key document defining basic terms and the organization of the Visual Studio solution. Don’t miss reading this document.
This WcfNQueueSMEx2 code example is a refinement of the code example used in the prior article in this “SO Apps” series. The example ignores vital cross cutting concerns – logging, security, and dependency injection, to mention a few – in order to keep the focus on the main topics of this series. Please do not implement your production apps exactly like this example.
Modifications were made to the WcfNQueueSMEx code example used in the first article to produce WcfNQueueSMEx2 as a system of collaborating microservices. Please refer to the following diagrams. Mind you, this is still a code skeleton. However, with the changes below, it now has a few muscles and some connective tissue. Here are the main code changes:
In the previously existing Data Feed Subsystem (which services the ingestionqueue), two service components were added that collaborate with the Data Feed Manager to perform validity checking and persistence:
- The Data Feed Manager service now orchestrates the validation of each dequeued ingested data item via a Validation Engine service. Invalid queued items are not stored.
- The Data Feed Manager service also now orchestrates the persistence of valid dequeued ingested data items to storage via an IngestedData Data Accessor service. It does this in a way that is decoupled from the physical storage medium, using a repository interface and its implementation. Note that the IngestedData Data Accessor currently does not use a local database. Rather, it relies on an external service for persistence – Azure PaaS Table Storage, reached via the repository – thus collaborating with an external service in the cloud.
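The repository idea above can be sketched in a few lines. This is a minimal, hypothetical Python sketch, not the example’s actual C#/WCF code; all names are invented for illustration. A Data Accessor depends only on an abstract repository interface, so an in-memory stand-in and an Azure Table Storage implementation are interchangeable without touching the Data Accessor.

```python
from abc import ABC, abstractmethod

class IngestedDataRepository(ABC):
    """Abstract repository: the Data Accessor codes against this
    interface, never against a particular storage medium."""
    @abstractmethod
    def save(self, item_id: str, payload: str) -> None: ...
    @abstractmethod
    def get(self, item_id: str): ...

class InMemoryRepository(IngestedDataRepository):
    """Stand-in implementation; a TableStorageRepository could
    implement the same interface against Azure Table Storage."""
    def __init__(self):
        self._rows = {}
    def save(self, item_id, payload):
        self._rows[item_id] = payload
    def get(self, item_id):
        return self._rows.get(item_id)

class IngestedDataAccessor:
    """Depends only on the repository interface, so the physical
    storage medium can change without changing this class."""
    def __init__(self, repository: IngestedDataRepository):
        self._repository = repository
    def persist(self, item_id, payload):
        self._repository.save(item_id, payload)
```

Swapping storage media then amounts to passing a different repository instance to the Data Accessor’s constructor.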
In the new Admin Subsystem:
- A second subsystem fronted by the Admin Manager was added to support the IFeedAdmin service contract. This service contract provides its clients (mainly UI clients) with service operations required for administering the Feed subsystem. Only one such operation is implemented now — Present Feed Component Info — used to report information about the ingestionqueue, e.g. queue length. In the future the Admin Manager will also support other system wide administrative behavior via other service contracts (facets) beyond those associated with data feeds.
- Again, this subsystem is an example of collaborating services. And, please note the reuse of microservice components across subsystems: Several subsystems use separate instances of the AdminDA and Validation Engine services.
In the new “Some Subsystem”:
- A third subsystem was added to further demonstrate how microservice components can be reused across subsystems, and how SO Apps can be composed of multiple separate subsystems that collaborate in various ways. Note that the collaboration between the Data Feed Subsystem and the new Some Subsystem is currently only via a shared database – the IngestedData DB (accessed via an external service). Such collaboration could, however, be more dynamic, involving queues or pub/subs that provide Manager-to-Manager communication. That kind of dynamic collaboration is not presented in this example; perhaps a subsequent article will feature it.
- This subsystem consists of:
- The Some Manager WCF service that receives requests from clients.
- A Some Data Analysis Engine service performing some sort of analysis on the previously ingested data.
- The reuse of the Validation Engine service (used by all subsystems) and the IngestedData Data Accessor service (used in the Data Feed subsystem) to access the IngestedData DB.
- To reduce development time, this subsystem does not currently access the IngestedData DB; the analysis results are simply strings inserted by a stub. The subsystem is here only to provide a sketch of the structure of an SO App having multiple subsystems and to demonstrate microservice component reuse.
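To illustrate the more dynamic, queue-based Manager-to-Manager collaboration mentioned above (which, as noted, the example itself does not implement), here is a hedged Python sketch using an in-process queue. All names are hypothetical stand-ins for the WCF Managers; a real system would use a Service Bus queue or pub/sub between hosts.

```python
from queue import Queue

class DataFeedManager:
    """Sketch of a Manager that, after handling an ingested item,
    notifies another subsystem via a queue instead of a shared DB."""
    def __init__(self, outbound: Queue):
        self._outbound = outbound
    def ingest(self, item: str) -> None:
        # Validation/persistence would happen here; then publish.
        self._outbound.put(item)

class SomeManager:
    """Sketch of a second subsystem's Manager consuming the queue."""
    def __init__(self, inbound: Queue):
        self._inbound = inbound
        self.analyzed = []
    def process_pending(self) -> None:
        while not self._inbound.empty():
            self.analyzed.append("analyzed: " + self._inbound.get())
```

The two Managers never reference each other directly; the queue is their only coupling point, which is what makes this style of collaboration loosely coupled.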
Below are diagrams of the call chains (the uses relation) of the 3 service operations (implemented by 3 Managers) in the WcfNQueueSMEx2 example. Except for the Clients, each box represents a service, and the arrows between boxes represent calls between services (or messages if you wish). These diagrams are not about classes or inheritance, but about services calling each other. In a real system each Manager would definitely have more than one service operation!
Figure 1 – Data Feed Subsystem’s call chain. DA is Data Accessor and DB is Database.
The Data Feed subsystem has one service operation and call chain implemented. Note that the dashed arrow above, going from the Data Source Client to the Data Feed Manager, indicates this is a queued call — it represents the ingestionqueue. All the solid arrows represent request/response calls.
As an aside, I have found this type of call chain diagram quite useful in designing systems, in communicating concepts to developers and between team members, and, as a developer, in more quickly implementing the services and clients required. I learned this technique at the week-long IDesign Architects Master Class, then practiced it in the design of several real world systems under the supervision of a very experienced architect at the week-long IDesign Architecture Clinic. Constructing such diagrams does not take much time at all, and refactoring the drawings is vastly faster than refactoring actual code.
Figure 2 – Admin Subsystem’s call chain. The Admin subsystem has one call chain for its single service operation. All of its calls are request/response.
Figure 3 – The new “Some Subsystem” call chain. It has one service operation and call chain implemented. Again, a real app would have more service operations here.
The WcfNQueueSMEx2 example system uses WCF to implement all of its collaborating internal services – Managers, Engines, and Data Accessors alike. Managers are hosted in Windows Services (simulated by console apps in the code example) or in Azure Worker Roles. The Engine and Data Accessor services run “in process” (aka InProc) with the Managers that use them. This is accomplished by the IDesign InProcFactory (in the ServiceModelEx library), resulting in each Manager using its own private instance of the Engine and Data Accessor WCF services it requires. Performance tests have shown that running services in-process like this supports 100 calls per second, and many more with performance optimization.
Alternatively, one could have a Manager instantiate the Engine and Data Accessor classes it uses as plain objects rather than WCF services (or use dependency injection to supply instances of these classes), as is done in object oriented systems. However, running Engines and Data Accessors as WCF services in process with a Manager via the InProcFactory provides great value, as follows:
- InProc WCF services are directly hooked into the “custom contexts” provided by WCF message headers. Thus an InProc Engine or Data Accessor can read from and write to the message headers shared by the Manager and other Engines and Data Accessors. The context these headers provide is a powerful way to separate the concerns of the system (i.e. the system processing the messages) from the concerns of the business domain (i.e. the domain oriented content of data contracts within the messages). This separation of concerns prevents polluting business-oriented data contracts with a bunch of system info required to effectively route and process messages. For example, the context can contain out-of-band parameters like user tokens, environmental info like whether the client is a mobile device or desktop system, message routing information, etc. These valuable kinds of information have no place in business domain oriented data contracts. Please see “Appendix B: Headers and Contexts” in PWS (cited below) for more information in this area.
- Running Engine and Data Accessor services InProc also provides them with the full power and extensibility of WCF as a message based communication framework. This includes automatic authentication of the caller if desired, transaction support, and data contract versioning tolerance, to mention a few very useful capabilities. It also allows services running InProc to easily “add tracing and logging, authorization, security audits, profiling and instrumentation, and durability” (PWS, p. 699). And it enables them to use the powerful extensibility mechanisms of WCF, plus the ability to implement aspect oriented behavior. All this can reduce development and maintenance costs, while supporting the implementation of very sophisticated behavior. Numerous sections in PWS (cited below) demonstrate how to exploit the power of WCF. For a comprehensive series of articles and code examples on WCF extensibility, please see Carlos Figueira’s blog at http://blogs.msdn.com/b/carlosfigueira/archive/2011/03/14/wcf-extensibility.aspx
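The separation of system context (message headers) from business data contracts described above can be sketched language-neutrally. This hypothetical Python sketch is not WCF and the names are invented; it only shows the idea that out-of-band system information travels in headers while the business contract stays clean.

```python
from dataclasses import dataclass, field

@dataclass
class DataFeedItem:
    # Business-domain data contract: no system plumbing here.
    source_id: str
    reading: float

@dataclass
class Message:
    # System concerns (user tokens, client kind, routing info)
    # ride in headers, out of band of the business payload.
    headers: dict = field(default_factory=dict)
    body: DataFeedItem = None

def reading_of(message: Message) -> float:
    # An Engine or Data Accessor can consult the shared header
    # context without the business contract knowing about it.
    if message.headers.get("client-kind") == "mobile":
        pass  # e.g. choose a lighter-weight response format
    return message.body.reading
```

Because the headers and the body are separate, the business data contract never accumulates routing tokens or environment flags, which is the separation of concerns the WCF custom contexts provide.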
In summary:
- Viewing and designing Service Oriented Apps as systems of collaborating microservices readily yields the benefits and reduced costs to be gained by applying the best practices of separation of concerns, abstraction, tight cohesion, loose coupling, and encapsulation.
- WCF as the implementation technology for Service Oriented Apps has the capabilities required to cost effectively build and maintain such systems designed with these best practices as their foundation.
- The code example shows that the code required to build such systems of collaborating microservices is not difficult at all. In fact, I find it quite easy to write, test, and refactor.
George Stevens
Bibliography
- PWS – Programming WCF Services, 3rd Edition, by Juval Lowy, Copyright 2010, O’Reilly Media, Sebastopol, CA.
dotnetsilverlightprism blog by George Stevens is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Based on a work at dotnetsilverlightprism.wordpress.com.
WcfNQueueSMEx is the starter app for my SO Apps series, which is sketched in “Introducing the ‘SO Apps’ Series of Blog Articles and Code Examples”.
WcfNQueueSMEx is a base upon which to build future examples. This small SO App fragment is a simple code skeleton whose function is to service data feeds, ingesting into a system data received via a queue from one or more data sources. It also demonstrates the use of a productivity enhancing library: IDesign’s ServiceModelEx for WCF, developed by Juval Lowy.
The goals of this article are:
- Provide you with an on-ramp to building WCF services using Service Bus Queues via the NetMessagingBinding.
- Demonstrate how a WCF service can be hosted as follows, with zero code changes to the WCF service code:
- In the cloud on Azure in a WorkerRole.
- In a Windows Service/Console App on-prem.
- In Microsoft’s new Service Fabric technology (or so it appears at the time of writing).
- Demonstrate useful productivity boosters – technologies and techniques that can reduce the amount of plumbing code you need to write, allowing a dev team to focus more on directly creating business logic (and hence value) instead of working with the plumbing code connecting chunks of business logic in a code base.
WcfNQueueSMEx uses a single Azure Service Bus queue to transmit data from one or more data sources to a WCF data feed service (data ingestion, in IoT parlance) that services the queue via the NetMessagingBinding. The WCF data feed service is hosted either in a console app (simulating an on-prem service) or in an Azure WorkerRole in an Azure Cloud Service. The NetMessagingBinding and ServiceModelEx are used to reduce the amount of plumbing code that must be written when servicing the Service Bus queue, increasing programmer productivity. A working code example is provided to give you running software you can experiment with.
Here is a tour of the projects in the solution:
- DataSourceSimulatorClient – A console app that enqueues messages to the Service Bus ingestionqueue with the DataFeedsClient proxy. The proxy uses the ServiceModelEx QueuedServiceBusClient and the NetMessagingBinding. Think of the DataSourceSimulatorClient as simulating an IoT device, or a process in a data center or in the cloud, that needs to periodically send chunks of data to a central processing service. There may be multiple instances of the DataSourceSimulatorClient running simultaneously, and they all place their data on the same ingestionqueue.
- DataFeedsManager – Hosted as described below, this is a WCF service listening to the ingestionqueue via the NetMessagingBinding and dequeueing messages from it. The DataFeedsManager’s sole role is dealing with the data feed in various ways, including persisting the data. In subsequent versions, processing the data will be added in a different manager (also a WCF service), to maintain a good separation of concerns between feeds and data processing, as recommended by the Single Responsibility Principle and the general microservices approach. In this version the DataFeedsManager receives a queued item and does nothing with it beyond display: it contains no business logic or persistence code yet – that is saved for a subsequent iteration. In real life, the business logic in this manager would involve saving data to storage, perhaps doing some on-the-fly prep for subsequent analysis; it could also involve placing a message on a different queue to send it to another service for specialized processing. For now it is just a skeleton, displaying the contents of each message to a Console when available and also via Trace statements when running in Azure. The defaults used in this WCF service are InstanceContextMode.PerSession and SessionMode.Allowed. It is called a “manager” rather than a “service” because this example uses the IDesign Method (TM) for its design. To learn more about this proven effective approach to designing Service Oriented Apps, please click the link.
- DataFeedsServiceHost – A console app that hosts the DataFeedsManager WCF service, simulating an on-prem Windows Service or IIS hosting. It uses the ServiceModelEx QueuedServiceBusHost.
- DataFeedsCloudWorker – Hosts the DataFeedsManager WCF service in an Azure Worker role in the cloud or in the Visual Studio Azure Emulator.
- DataFeedsCloudSvc – The Azure publishing package containing the DataFeedsCloudWorker.
- Shared – Contains items that are shared between the Client and Service, like the DataContracts and Service Contracts, plus other things as well.
- MiscHelpers – Contains various helper classes.
- ServiceModelEx – System.ServiceModel is the Microsoft assembly for WCF. ServiceModelEx is the IDesign library of extensions to WCF, plus productivity boosters. ServiceModelEx components are used to host the WCF service, and also used to resolve complex DataContract definitions between the client and server, without having to resort to the cumbersome KnownType attribute that tightly couples code and can be an irritating source of maintenance problems.
- App.config Files – Both the DataFeedsCloudWorker and DataFeedsServiceHost instantiate and open the ServiceModelEx QueuedServiceBusHost, which uses information in their app.config files to connect to the Service Bus ingestionqueue via the NetMessagingBinding. In a similar way the DataSourceSimulatorClient uses the ServiceModelEx QueuedServiceBusClient (via the DataFeedsClient proxy) which depends upon an app.config file’s settings for the NetMessagingBinding. A production app would likely use code for these purposes, although app.configs work well during early development to enable fast progress.
To run the code, first install it on your system and follow the Setup Instructions that come with the code. Then run the DataSourceSimulatorClient to enqueue a few messages. Follow that by running the DataFeedsServiceHost to dequeue them. Then take it from there, also running the DataFeedsCloudSvc with the debugger/Azure Emulator, and finally deploying it to Azure. The Setup Instructions indicate how to run the above.
When inspecting the code, notice the following areas that deal with increasing productivity:
- No changes to the WCF service code are required to host it in either of the two hosts above. The WCF DataFeedsManager encapsulates the business logic separately from hosting code. This separation of concerns minimizes what must be changed when re-hosting. The rise of hybrid apps and cloud computing has increased the need for flexible hosting. Hosting has become a common area of volatility that needs to be well encapsulated, to prevent code changes from rippling through the code base again and again.
- The NetMessagingBinding takes care of most of the details of dealing with the ingestionqueue. This means you do not have to write plumbing code to enqueue, dequeue, and convert a BrokeredMessage item on the queue to and from the types used in the app (the TestMessage and SbMessage types in this case). Plumbing code creates no direct business value; it is there purely to connect chunks of business logic together. All of the plumbing work required is done behind the scenes by WCF, the binding, and the GenericResolver, as follows.
- The work of resolving types between WCF services and clients by manually applying the KnownType attribute correctly (and chasing bugs when doing it incorrectly) is avoidable by using ServiceModelEx’s GenericResolver. This is one good reason to choose WCF over using the native Service Bus queues directly. Save time, get done sooner.
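As a rough conceptual analogue of what the GenericResolver does (this is not its actual implementation, and the Python names below are invented for illustration), a serializer can embed the concrete type name in the wire format so the receiver can reconstruct the right type without a hand-maintained known-types list:

```python
import json

# Hypothetical message contracts; a real system would share these
# between client and service, as the Shared assembly does.
class TestMessage:
    def __init__(self, text): self.text = text

class SbMessage:
    def __init__(self, text): self.text = text

# Registry built once from the shared contracts, replacing the
# per-contract KnownType annotations.
TYPES = {cls.__name__: cls for cls in (TestMessage, SbMessage)}

def serialize(msg) -> str:
    # Embed the concrete type name alongside the payload, roughly
    # what a type resolver contributes to the wire format.
    return json.dumps({"type": type(msg).__name__, "text": msg.text})

def deserialize(wire: str):
    data = json.loads(wire)
    return TYPES[data["type"]](data["text"])
```

The point of the sketch: the receiver recovers the concrete type from the message itself, so adding a new message type means adding it to the shared contracts once, not annotating every service contract that might carry it.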
For a complete description of the GenericResolver see “Known Types and the Generic Resolver” by Juval Lowy in MSDN Magazine. In order to host a WCF service in Azure or in IIS and use the GenericResolver, you need to prefix the name of each assembly containing types that need resolution with “App_Code.”. More specifically, you need this prefix when the hosting process has any of the following names: “w3wp”, “WebDev.WebServer40”, or “WaWorkerHost”. In this article’s code example such prefixing has been done to the Shared assembly by naming it App_Code.Shared in the properties page of the Shared project. This makes the GenericResolver work when running in an Azure WorkerRole (in the WaWorkerHost process).
Also note that I have applied the GenericResolverBehavior attribute to the DataFeedsManager WCF service since it depends upon complex types. This ensures the GenericResolver will be installed regardless of the service host used, although the example code uses ServiceModelEx’s QueuedServiceBusHost and QueuedServiceBusClient, which install the GenericResolver by default. Please see the ServiceModelEx source code and the article above for more information.
In the case that you have a Data Source client that cannot or does not use a WCF proxy to enqueue items, it is still possible to have a WCF service use the NetMessagingBinding to service an ingestionqueue containing BrokeredMessages. This scenario fits cases where clients must use the Service Bus REST API, or where clients are written in languages other than .NET. Please see the blog post “Receiving messages using NetMessagingBinding with custom formats” by Abhishek Lal for how to do this.
To see how the WCF OperationContext can be used with Service Bus queues to gain access to the BrokeredMessage properties, plus more minute details, please see the code example and write up by Paolo Salvatori, “How to use WCF to send/receive messages to/from Service Bus for Windows Server”.
There you have it, an on-ramp to using WCF with Service Bus queues on-prem and in Azure. I hope you find WcfNQueueSMEx as useful a learning tool as I did in developing it.
Thanks to Robert Broomandan for aiding me in getting started on this path.
Welcome to the Service Oriented Applications (SO Apps) series of blog articles. This series seeks to explore what SO Apps are, their benefits, plus effective techniques for building them, in blog-sized code examples. My top goals for the SO Apps series are:
- Use simple “code skeleton” examples to explore various concepts, i.e. avoid dealing with business logic to focus on overall app structure, infrastructure, etc. required to productively build solid SO Apps.
- Explore developing apps that can operate in different environments – namely Azure, the on-prem data center, plus hybrid and IoT apps that span several environments – and how best to use the microservices concept.
- Use good separation of concerns at all levels to make coding more fun, with less widespread, intense refactoring, reducing time to market and costs as well.
- Explore the use of messaging (queues, pub/subs, etc.) in service oriented apps.
- Identify and demonstrate the use of productivity enhancing techniques (e.g. SOLID) and libraries.
To facilitate the SO Apps series I’ve started a personal GitHub account that contains code examples for each blog article. Please visit https://github.com/gstevens-SoApps/Blog1-WcfNQueueSMEx to see the code for the first article in the series. The first article is “SO Apps 1, WcfNQueueSMEx – WCF NetMessagingBinding In-Azure and On-Prem“.
I hope you find exploring this topic as useful as I have.
Soon after I posted my April 2015 blog — Project Design: Next-gen Project Planning Technology — Juval Lowy, the originator of Project Design, sent me a list of the key people whose ideas influenced him during the 14 years he developed Project Design.
The focus of my April 2015 blog was upon software projects being dynamic non-linear chaotic systems, and thus expected to routinely behave in exponentially unpredictable ways. However, the ability to effectively deal with chaotic systems is only a small part of the capabilities of Project Design, and the story behind it. These capabilities and that story are told in brief by Lowy’s list of the key people he used to provide a strong foundation for Project Design.
With permission from Lowy I quote his list below in its entirety. Items enclosed in quotes are from Lowy. My editorial comments are marked with “George”. Below are the pillars in the foundation of Project Design method, the next-gen software project planning technology:
“Here is the list of giants, in chronological order:
1871, Helmuth von Moltke, the Elder, Germany: Designed a system of options for agility and rapid execution:
http://en.wikipedia.org/wiki/Helmuth_von_Moltke_the_Elder”
George – Please see the article’s section “Moltke’s Theory of War” for this topic. Was this the first large scale use of an agile process? “Moltke’s main thesis was that military strategy had to be understood as a system of options since only the beginning of a military operation was plannable”, Wikipedia. This is now popularly stated as “no plan survives contact with the enemy”, Wikipedia.
———
“1949, Admiral Rickover, US Navy, structured approach for project design, critical path, floats as the only way to assign resources, get repeatable projects, rein in complexity, contain costs:
http://en.wikipedia.org/wiki/Hyman_G._Rickover”
———
“1967, Sydney Opera House fiasco, then recovery using large scale modeling of critical path as network of networks:”
George – The initial attempt at designing, planning, and building the Sydney Opera House was a project planning disaster. The initial plan was for 4 years; the actual time was over 14 years! The project was rescued by 3 Australian engineers.
———
“1970, James Antill, Ronald Woodhead in Australia capturing the lessons; this later matured into a full methodology with project crashing, advanced techniques in project design:
http://www.amazon.com/gp/product/0471620572/ref=wms_ohs_product?ie=UTF8&psc=1#_”
George – This book is Critical Path Methods in Construction Practice by the above named authors. It uses the Sydney Opera House as one of many case studies examined to find useful project planning techniques and methods.
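Critical path and float calculations of the kind these methodologies build on can be sketched briefly. The following Python sketch is purely illustrative (the four activities and their durations are invented, not from any source cited here): it computes earliest starts, latest starts, total floats, and the critical path for a tiny activity network.

```python
# Hypothetical 4-activity network; durations in days.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

def schedule(durations, preds):
    order = list(durations)  # keys are already topologically ordered here
    # Forward pass: earliest start of each activity.
    early = {}
    for a in order:
        early[a] = max((early[p] + durations[p] for p in preds[a]), default=0)
    finish = max(early[a] + durations[a] for a in order)
    # Backward pass: latest start without delaying the finish.
    succs = {a: [s for s in order if a in preds[s]] for a in order}
    late = {}
    for a in reversed(order):
        late[a] = min((late[s] for s in succs[a]), default=finish) - durations[a]
    # Total float = slack; zero-float activities form the critical path.
    floats = {a: late[a] - early[a] for a in order}
    critical = [a for a in order if floats[a] == 0]
    return floats, critical
```

In this invented network the critical path is A → C → D, and activity B carries 2 days of float, which is exactly the slack a planner can spend when assigning resources.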
———
“1972, David Parnas, project design stems from a software design that does not change when requirements do. He was first to propose such a design:
http://en.wikipedia.org/wiki/David_Parnas“
George – A key idea of Parnas’s, critical to a software architecture that does not change when requirements change, is known as “Information Hiding”. Please read the first paragraph of http://en.wikipedia.org/wiki/Information_hiding, in addition to the link above, to understand the general idea from which Project Design stems.
———
“1979, Daniel Kahneman, Prospect Theory, the best way to evaluate options is by risk, not by utility value:
http://en.wikipedia.org/wiki/Prospect_theory”
George – This is why Project Design objectively measures several kinds of risk in a project plan.
———
“2010, Frederick Brooks, designing the design, captures best the environment and process required for projects in general:
George – This is a book called The Design of Design: Essays from a Computer Scientist.
———
Then there is a variety of more general ideas such as complexity theory, dynamic systems, antifragility, quantifying risk, the math I used to bring it all together, and more.”
Thank you, Juval Lowy, for sharing the foundations of Project Design with us!
Disclaimer – In no way am I being compensated for the opinions I have expressed in this article. I have taken the Project Design course and subsequently have spent over 5 intense weeks learning how to effectively use Project Design on a substantial learning project. Through this experience I have come to believe that the diligent use of Project Design will result in an exceptionally high rate of projects being delivered on schedule, on budget, and on quality. Thus, to me, Project Design is worthy of blogging about since it has the potential of significantly increasing the productivity of software development and customer satisfaction, which are main themes of this blog.
George Stevens
In early 2013 a new technology for software project planning was made available to the public. It’s called Project Design and is available through a week-long course given by IDesign. Project Design was conceived and developed by Juval Lowy, the principal of IDesign and the course instructor.
The Project Design technology holds high promise of facilitating a very high success rate for software projects that use it, in terms of the project being on schedule, on budget, and on quality.
The underpinnings of Project Design are mathematical models of software projects, coupled with a strong engineering approach to identify and solve the problems around effectively planning project schedules and budgets. While working as an Enterprise Architect for a Fortune 100 company in the late 1990’s Lowy started experimenting with using mathematical models to understand and control the project plans on his software projects. He discovered that he could indeed use such models to understand the underlying dynamics of a project, and leverage this to gain a much higher degree of control over the project outcome.
After a few years of such experimenting, the Project Design technology emerged. Lowy has been using and refining Project Design since 2000, with an exceptionally high rate of success in the dozens and dozens of projects he has done over this time.
The general class of mathematical models Lowy found to fit software project plans is known as dynamic non-linear chaotic systems. Such systems are popularly associated with the “Butterfly Effect”, e.g. “A butterfly flapping its wings in Nicaragua caused my software project to slip its schedule.”
Now don’t try blaming your project failures on butterflies! Instead understand the inherent instability of chaotic systems, and that Project Design allows you to pin-point the sources of instability and take action to make them much more robust.
A key characteristic of dynamic non-linear chaotic systems is that a tiny change in initial conditions can result in vastly different outcomes. This is in part due to their characteristic of “emergent behavior”. In other words “if we start with only a finite amount of information about the system…, then beyond a certain time the system will no longer be predictable”, Wikipedia, “Chaos theory”. The subsequent unpredictable behavior is emergent behavior.
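This sensitivity to initial conditions is easy to demonstrate with the logistic map, a standard textbook example of a chaotic system (chosen here for illustration; it is not taken from Project Design itself):

```python
def logistic(x, r=4.0):
    """One step of the logistic map; r = 4.0 is in the chaotic regime."""
    return r * x * (1 - x)

def trajectory(x0, steps=50):
    """Iterate the map from x0, returning the whole path."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic(xs[-1]))
    return xs

# Two "projects" that start almost identically...
a = trajectory(0.200000)
b = trajectory(0.200001)
# ...stay close at first, then diverge until they bear no
# resemblance to each other: the error grows exponentially,
# just as Auyang's drawing (discussed below) depicts.
```

A difference of one part in a million in the starting value is invisible for the first few steps, yet after a few dozen iterations the two trajectories are completely decorrelated.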
Sunny Auyang’s deep but understandable paper on chaos theory, “Nonlinear Dynamics: How science comprehends chaos”, demonstrates this in a nutshell with a drawing.
Here is how Sunny’s paper explains the drawing: “Given an initial condition, the dynamic equation determines the dynamic process, i.e., every step in the evolution. However, the initial condition, when magnified, reveals a cluster of values within a certain error bound. For a regular dynamic system, processes issuing from the cluster are bundled together, and the bundle constitutes a predictable process with an error bound similar to that of the initial condition. In a chaotic dynamic system, processes issuing from the cluster diverge from each other exponentially, and after a while the error becomes so large that the dynamic equation loses its predictive power.” Quoted with permission from Sunny Auyang.
Long story short – Software project plans are dynamic non-linear chaotic systems. Therefore the basic nature of software project plans is that they are inherently unstable, characterized by seemingly small changes in initial conditions giving rise to large divergences and unpredictable outcomes.
No wonder we have such trouble with software project planning, up against behavior like this! I find the below diagram aids in visualizing the dynamics of a project.
Without knowledge of this kind of behavior, there is a significant probability that any project will end up taking one of the exponentially divergent paths in the above diagram. The result of following such a divergent path is being behind schedule, over budget, and/or having low quality.
Wouldn’t it be better to set your project up ahead of time to follow one of the more direct paths?
The good news is that, even though they are dynamic non-linear chaotic systems, we can control the behavior of software projects. Lowy’s Project Design greatly facilitates this. It can be done initially by setting up robust initial conditions that do not foster emergent behavior so quickly. And it can also be done by closely monitoring the project for trouble signs, then quickly taking action to correct the trouble. Project Design allows one to understand the dynamic system, to know what variables you can control in the plan, to know the effect each variable has on the overall system, and also provides a means for getting feedback as the project progresses so one can make accurate adjustments. Now you can make a dynamic project plan system do whatever you want!
By way of summary, in developing Project Design over 2 decades Lowy has discovered 1) how to effectively model software project plans, 2) how inherently unstable they are, and (even better) 3) a set of actions that can be taken to stabilize project plans, plus ways of objectively comparing different project plan options to increase the success rate. And, it is compatible with Agile development processes!
Below are some links that provide more detail about Project Design, the class offered by IDesign, and testimonials by people attending the class.
The Project Design Master Class
http://www.idesign.net/Training/Project-Design-Master-Class
Video – “What is Project Design?”
http://www.youtube.com/watch?v=IOrU57SiBIg
What Juval Lowy says about Project Design – “Project Design – A Call for Action”
http://www.idesign.net/articles/Project-Design-Call-For-Action.htm
Video – “Introduction to Project Design” by Juval Lowy
http://www.youtube.com/watch?v=dCYOItESx0Q
And for my blog article about the sources of the key ideas Lowy used to develop Project Design please see Foundations of Project Design: More than Chaos Theory.
Disclaimer – In no way am I being compensated for the opinions I have expressed in this article. I have taken the Project Design course and subsequently have spent over 5 intense weeks learning how to effectively use Project Design on a substantial learning project. Through this experience I have come to believe that the diligent use of Project Design will result in an exceptionally high rate of projects being delivered on schedule, on budget, and on quality. Thus, to me, Project Design is worthy of blogging about since it has the potential of significantly increasing the productivity of software development and customer satisfaction, which are main themes of this blog.
P.S. Sunny Auyang has numerous interesting papers and sketches of her books about science and engineering on her website at http://www.creatingtechnology.org. Thanks for sharing them with us!
dotnetsilverlightprism blog by George Stevens is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Based on a work at dotnetsilverlightprism.wordpress.com.
Yes, that is exactly what I mean — Where, not what. This article seeks to get you thinking about your thinking. Be self-reflective and greatly benefit.
This is the second article in my “Where Are You Thinking” series. The initial article was written in December 2013 — https://dotnetsilverlightprism.wordpress.com/2013/12/08/where-are-you-thinking/
The first article aimed at getting you to think about your thinking with regard to the level of abstraction you are thinking at in any moment – Concept Level, Interface Level, or Implementation Level. And it encouraged you to use this awareness to consciously adjust “where your thinking is” to the needs of the task at hand – for example, not dwelling on implementation details when engaged in high level design.
Part 2 of “Where Are You Thinking” is about historical time – Are you thinking in terms of the concepts of the past, present or future?
As in Part 1, the goal here is to give you conceptual tools to aid you in focusing your thinking so you and your team can be much more productive in doing software engineering that produces lasting value.
In February 2015 I posted a blog, “Waves of Technology Change: Grab Your Surfboard or Life Jacket?” – https://dotnetsilverlightprism.wordpress.com/2015/02/09/waves-of-technology-change-grab-your-surfboard-or-life-jacket/
That post identified a number of different technology waves that are currently in various stages of washing over us. Some of these waves are: the Mobile Devices Wave, the Cloud PaaS Wave, the Big Data Wave, the soon to hit IoT Wave, plus the very long wave length Distributed Computing Wave. Stepping back, the Mobile Devices, Cloud, Big Data, and IoT waves can all be viewed as aspects of the Distributed Computing Wave that also encompasses Service Orientation.
The “Waves of Technology Change” article got me thinking – While some of the concepts and ideas I regularly use as a software engineer are useful in the current environment, others are solidly rooted in the past and out of sync with recent changes! I call them fossil concepts — concepts applicable to yesterday, but not to today or tomorrow.
At one time in the past my current fossil concepts were highly useful. But not now. The rapid advance of the Waves of Technology Change in the past few years has made some of my “trusty old friend” concepts quite out of date. In fact, thinking in terms of fossil concepts can quickly get one into trouble when developing new software systems.
Here are some of the things I’ve learned to do to identify my fossil concepts, and find good replacements for them.
First, I completely accept the fact that in the past my fossil concepts were an important part of my professional software engineering process. And I also accept that I need to keep my “operating set” of software engineering and technology concepts current in order to produce good value. That is the way life is these days. Fighting technology change does not work very long.
Second, I identify the areas having the biggest and most far reaching changes. These areas point to where I need to look at my concepts to identify fossils. The “Waves of Technology Change” article is pointing me at the following areas that are my highest priority right now. However, I am sure there will soon be other areas requiring this scrutiny.
What are the most far reaching areas of technology change in your work that you need to learn about?
My 2 top priority areas of the biggest and most far reaching technology changes are as follows:
1. Distributed Systems and Service Orientation, including Mobile and IoT Devices
Here, lots of new concepts are needed to effectively deal with the massive changes that are taking place. Specifically, we need to start thinking in much larger terms than just Apps alone. Rather than focusing upon a single App, concepts like Rachel Hinman’s “device ecosystems” interacting with each other through various “touch points” in the overall solution architecture are now highly useful. These concepts come from her 2012 book “The Mobile Frontier”, a review of which is at https://dotnetsilverlightprism.wordpress.com/2013/01/01/book-review-of-the-mobile-frontier/
Monty Montgomery has a great article about thinking in terms of Systems rather than just Apps. And it has some really, really useful high level diagrams! Escaping Appland
For my part I find the following concept useful — It pulls in both Rachel and Monty’s ideas:
- Apps are directly used by people, anywhere. Apps use Services, anywhere. Services also use other Services, anywhere. Systems are the network of inter-dependencies created by Apps using Services, and the Services using other Services.
- And you can superimpose your own boundaries or mappings of other concepts on this network to suit your own purposes. Here I am thinking of the vital ideas concerning the granularity of services and the fractal nature of software noted by Monty Montgomery in his article cited above.
2. The Cloud
There are lots of new concepts about the Cloud that we need to incorporate into our thinking. Examples are:
- Eventual Consistency, Transient Faults, and executing Error Compensation use cases rather than relying on distributed transaction rollbacks.
- The declarative approach Azure App Service uses with API Apps, Logic Apps, and Web Apps. Is there a new programming model emerging here?
- The availability of almost unlimited compute power to do tasks or sift through oceans of data.
- Plus lots more to come.
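As a small illustration of the first item in the list above, here is a hedged Python sketch of retrying transient faults with exponential backoff, the cloud-native substitute for assuming every call either succeeds or rolls back. All names here (TransientError, call_with_retries, flaky_service) are invented for illustration; real cloud SDKs ship their own retry policies.

```python
import time

class TransientError(Exception):
    """A fault expected to clear on its own (throttling, brief network blip)."""

def call_with_retries(operation, max_attempts=4, base_delay=0.01):
    """Retry 'operation' on TransientError, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the fault to the caller
            time.sleep(base_delay * (2 ** attempt))  # 10ms, 20ms, 40ms...

# Simulated flaky cloud service: fails twice, then succeeds.
calls = {"n": 0}
def flaky_service():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("throttled")
    return "ok"

result = call_with_retries(flaky_service)
print(result)  # -> ok, after two retried transient faults
```

The point is the shift in mindset: in the cloud, transient failure is a normal, expected outcome to be absorbed, not an exceptional one to be rolled back.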
The Cloud is also a place to watch closely for the emergence of disruptive technologies – game changers. The associated concepts and ideas will be vital to know about.
The third way I identify my fossil concepts and find replacements is to do a thought experiment, attempting to project my thinking 5 to 7 years into the future. Here’s the drill:
Imagine you are on a team just beginning to design and build a new App and its supporting Services (a System). And, the technologies to be used have not yet been selected. It will take about 1.5 to 2 years to build the system and deliver it. So we’ll deliver in 2017.
Now ask this question: “Five years past delivery (2022) how easy will it be to readily find developers highly skilled in the technologies the system was built with?” Such developers will be required to maintain the system for a number of years into the future, till the end of the life of the system.
At that point, seven years will have passed since the technology was selected. Thinking this question through for UI technologies, Server technologies, database technologies, etc. is a thought experiment that will teach you a lot.
And please do not forget to consider free open source technologies versus technologies supported by vendors. Will you have any trouble finding developers up to speed on using a 2015 “in fashion” free technology seven years down the line?
The current period has some of the most interesting technology changes occurring since software development started. Your “fossil concepts” have served you well, but at some point they get in the way. Learning about the new technologies and their underlying concepts can be a fascinating and rewarding experience.
Over the past 5 years I’ve followed the Indeed Top Job Trends, informally checking them several times a year. They have provided me with insight on the current software and computing technology changes. You can see the job trends for yourself at http://www.indeed.com/jobtrends. These charts show the relative demand in the job market for various technological job skill keywords in Indeed’s job ads. In July 2013 I posted my first article about these trends at https://dotnetsilverlightprism.wordpress.com/2013/07/27/net-single-page-applications-spa-helpful-info-sources/.
Herein I use the mental model of waves of technology change washing over us, similar to the waves one sees at the beach when the surf is up. While over-simplified, it is a useful model for conceptualizing how we are impacted by multiple technology changes. Each wave of technology change proceeds at its own pace over time, yet also interacts, to a greater or lesser degree, with some or all of the other waves of technology change in progress. Thus, where several of these waves intersect they reinforce each other, causing a larger amplitude of change, with more impact on and disruption of the status quo.
Please understand my goal with this mental model is pragmatic. I am not using scientific techniques to clearly identify what is a wave and what is not; look at the charts below and you’ll see the waves. Rather, my goal is to use the wave metaphor to 1) inform you of the changes taking place via charts of relevant data, and 2) set up for subsequent blogs that deal with techniques one can use to thrive amidst multiple big waves. Such disruptions increase the uncertainty and risk of making business and technology decisions. Understanding these dynamics is part of the answer to managing risk and uncertainty. And these uncertain times of change will be with us for a number of years to come, as the length of the below trend lines portends. This short video clip summarizes our predicament – one wave after another: https://www.youtube.com/watch?v=yK0hIWmwtd8. Should I be reaching for my surfboard or my life jacket?
To effectively deal with such changes I support finding ways to manage risk, but also to exploit some of the technology changes at an acceptable level of risk. In future blogs I’ll state more specifically how to do this. It involves education, a willingness to embrace change, a planned approach to development, driving out unknowns and risks earlier rather than encountering them too late, plus the design techniques of service orientation, decoupling things so they can vary independently, and encapsulation of areas of significant change. This approach provides both life jackets and surfboards. These techniques offer a significant source of stability in uncertain times, despite the many waves of change washing over us.
Over the past 4 decades the below big waves of computer technology change have generally proceeded one at a time:
- Mini-computers and their software displacing mainframes in the 1970s.
- The rise of PCs, desktop software, and workstations in the 1980s.
- The rise of the World Wide Web in the 1990s.
- The rise of the .NET and Java frameworks and their support of widespread distributed computing in the 2000s.
But now, in the 2010s (the “teens”) we have several really big waves of technology change proceeding concurrently, or at least in very rapid succession, listed below. Note that it is never clear how big a wave will be until it has run at least part of its course. Therefore the below list is my best guess at the current time, aided by the Indeed job trend data presented subsequently:
- New Web wave – New in the “teens”. HTML5 and jQuery continuing and accelerating the 1990’s rise of the Web.
- Mobile wave — New in the “teens”.
- Big Data wave – New in the “teens”.
- Cloud wave – New in the “teens”.
- Internet-of-Things (IoT) wave – New in the “teens”?
- There could be other waves as well, like the Distributed Computing wave, which is still at work and partially included in the Cloud wave. However, I am limiting this list to items found in the Indeed Top 10 Trends, and the Distributed Computing wave is not in it. I am, however, adding the Internet-of-Things wave since its scope and effect are so vast, perhaps larger than Mobile as it is today. I’ll have more on that topic in a subsequent blog someday.
Wow! We now have at least 4 new big waves of technology change to deal with. Any one of the above waves is challenging for enterprises and professional developers to deal with by itself. But when a number of the waves start washing over us all at once or in rapid succession things become much more complex. Is it time to grab for your surfboard, or your life jacket?
What does the Indeed job market demand data tell us about these waves of change?
Top 10 Job Skills Keywords and their Categories
The top 10 desired job skill keywords found in the Indeed job listings are listed below. I have provided explanations of the terms that may not be commonly known by software technologists.
- HTML5
- MongoDB – A popular NoSQL database.
- iOS
- Android
- Mobile app
- Puppet – A Linux deployment utility.
- Hadoop – A Big Data analyzer, using Map Reduce algorithms and special query languages called Hive and Pig to query and analyze tons of data in parallel, often using many, many, many virtual machines to do so.
- jQuery
- PaaS – Cloud Platform-as-a-Service capabilities.
- Social Media
Categorizing the above items into functional categories we have the following waves (in bold) and their main components:
New Web
- HTML5, jQuery, Social Media
Mobile
- iOS, Android, Mobile app
Cloud
- PaaS, MongoDB?
Big Data
- Hadoop, MongoDB?
Linux Deployment
- Puppet
Categorization Notes:
- MongoDB appears in both the Cloud and Big Data categories. The job ads show the MongoDB technology is widely used, and may well be appropriate for other categories as well. Perhaps there is a NoSQL wave I omitted.
Analysis
First, note that a year ago the list contained much the same items as it does now! That, in and of itself, is quite interesting. The technologies on the list today that were also on last year’s list are:
- Mobile — This wave has not yet peaked, although some sectors may have.
- New Web — This wave is well beyond its peak, with things returning to a new equilibrium.
- Big Data — This wave is continuing its advance and has probably not peaked.
The new items on the list this year are PaaS and Puppet. PaaS’s presence on the list this year announces that the Cloud wave has just started to hit us. It is no longer “out there” at sea, but it is beginning to have a significant impact on the job market, and upon the technology we need to deal with in the high tech industry. Note that in a 2/9/15 Investor’s Business Daily article “Amazon Cloud Business Reigns; Google, Microsoft Make Impact” it was stated that “only 8% of the businesses worldwide that could move to the cloud have done so, says IDC”. That confirms there is a whole lot to come in the Cloud wave.
I am not sure what Puppet’s presence on the list means, being unfamiliar with the details of the Linux world. Nor can I discern whether it is “wave worthy”, i.e. can a single deployment technology be considered a wave? Perhaps it is part of something else.
And, there is no evidence as yet of the Internet-of-Things wave as an entity clearly visible as being separate from the Cloud wave of change, upon which IoT depends. That wave remains “out there”, looming on the horizon.
Looking at the individual graphs of these technologies over time is instructive. All of the below charts show the “absolute percentage of matching” job postings containing the keyword. These charts give one a sense of whether a particular wave has peaked yet, or is past its peak.
If a wave has not yet peaked you can be sure the effects of that wave will continue rippling out causing continued disruptions of the status quo. After all, isn’t that what change is? Once a wave has peaked, however, the charts typically show a period of stability in market demand in a sideways consolidation. During this time the industry and consumers are establishing a new status quo. This likely takes several years.
First let’s take a look at the graphs of technologies that are in strong up-trends, and thus have not peaked. I’ll deal with those in consolidation patterns later. I define a “strong up-trend” as the curve being at, or above, the obvious trend line (a straight line drawn connecting the low points of the up-trending curve).
These 3 graphs say that the technology changes associated with Mobile Apps, Puppet, and the Cloud Platform-as-a-Service (PaaS) are still gaining strength and have not yet peaked, or even slowed down that much in the past FIVE years. And, both Mobile and the Cloud showed strong acceleration in 2014. These are very strong, long duration waves of change!
It would be interesting to see what the strongly trending key words were in the top 20 or 30 Indeed items, rather than just the top 10, and how long those trends have been going on. That data is not readily available from Indeed.
What about the technology change waves that are not strongly trending right now, but in consolidation patterns?
First let’s look at the items whose curves, below, are in the shortest consolidation patterns, having peaked roughly in 2013 – Hadoop, MongoDB, and HTML5. All have been in sideways consolidation patterns for roughly 2 years and have been strongly trending for 5 to 6 years.
The 2 Big Data items may currently be “breaking out” of the consolidations to resume their up trends. That will become evident in 2015. This says the Big Data wave may not have peaked yet. I’ll bet it has not, but time will tell. However, the demand for HTML5 may well have peaked since the consolidation is much deeper than either of the Big Data items.
Below, take a look at the items in the longest consolidation patterns, or even in possible down trends. All of the following technology change waves peaked in 2012. And their trend durations were 4 to 6 years, with the exception of iOS, which had the shortest trend duration of 2 and a half years since the iPad was introduced in 2010.
Note the difference between the Mobile App curve above in the strong trend section and the below iOS and Android curves in 3 year consolidations. The Mobile App category is picking up some things that are not iOS and Android. Maybe Xamarin? Try an Indeed search for Xamarin in the Trends page and you’ll see an extremely strong uptrend for the past 2 years (not shown). That is likely where the bulk of disruption is in the Mobile wave at this time, although it bears further investigation.
The wave in jQuery has peaked. And the trend in Social-Media has plateaued, if not peaked.
So there you have my overview of the waves of technology change we are dealing with right now. I hope this will provide you some perspective that aids you in your decision making this year.
Now about grabbing a surfboard (so you can attempt to reap gains from a wave by riding it) or your life jacket (so you can manage the risk of the disruptions): I say, use both! After all, the dedicated big wave surfers who ride huge 40+ foot waves at Maui’s Jaws (aka Peahi) typically wear an inflatable life jacket when they surf those giants, since it’s a game of survival in that environment.
And so is it also a game of survival for many businesses in these times of great change, having to replace or refurbish old software systems in a time of great technological uncertainty. It is easy to make mistakes and waste time and money in such an uncertain, yet competitive, environment. Using a few time proven principles of software engineering goes a long way in preventing such mistakes.
Hth
Automated workflows can be thought of as long running processes – processes that may take minutes, hours, days, or even weeks to run to completion. As such, workflows present challenges not encountered in “normal” software development. Also, some workflows involve user interaction, while others may not. And workflows can be implemented in WCF and Windows Azure Cloud Services (PaaS), as well as through specialized workflow frameworks like Windows Workflow Foundation. The patterns presented herein are aimed at WCF and Azure implementations.
Over the last 2 years my interest in the design of automated workflows has been inspired by several architecture courses I have taken from Juval Lowy’s company, IDesign. While the courses themselves have been invaluable in my learning the design of service oriented applications, the IDesign Alumni Forum (on Google Groups) has proven to be a great way to learn additional concepts and techniques as well.
Membership in the Forum is available by taking one IDesign course. The contributors to the Forum’s various threads include industry experts (like Juval Lowy and Monty Montgomery from IDesign) as well as other highly experienced architects and developers working in our business. Thus, one can ask questions and get answers from knowledgeable people. And one can also just sit back, read, and learn. The knowledge I have gained from the Forum has directly benefited me in my professional work in many ways!
In this article I present a summary of some powerful patterns that can be used in the design of automated workflows. These include design patterns, messaging patterns, and workflow patterns. I thank Juval Lowy for his feedback on the second part of this article.
A Summary of Powerful Workflow Patterns
Below are sketches of patterns helpful in sequencing the steps of workflows. Most of these patterns are concerned with ways of decoupling things and/or producing maintainable code. I won’t dive into the details here in this summary. Workflow patterns are a world unto themselves. However, I hope the following list points you in the right direction to get you started.
State Machine – The full name is Finite State Machine. The value of using a state machine is that it can handle very complex interactions while also producing highly maintainable code. High value! See Wikipedia for more details at http://en.wikipedia.org/wiki/Finite-state_machine.
- For a workflow, a state machine can control the sequence of the execution of the steps in a workflow. Here’s a sketch: It is all based upon the concept of the current state of the workflow and an “event” that is input to the state machine (e.g. a system input or user input). The “event” causes the state machine to automatically transition to the next state. The next state is predetermined by a mapping between “events” and “next states”, given a current state. This mapping is often stored in a lookup table called a “state transition table”. The work of an individual workflow step is accomplished during the state transition.
- State machines can be nested, a technique which adds to their power to deal with highly complex interactions while keeping the code very maintainable.
- A state machine is typically encapsulated within an object. There are several designs for state machines, such as the Gang of Four State Pattern and the State Table approach. Robert Martin has good examples of both in his book Agile Principles, Patterns, and Practices in C# published in 2006. There is a lot written about state machine designs so you can find out about them in other sources as well.
- State machines are also extremely useful for sequencing non-workflow things as well, e.g. UI interactions like a wizard, complex interactions between objects, hardware designs, hardware/software interactions, etc.
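To make the state-table approach above concrete, here is a minimal sketch (my own illustration, not taken from Robert Martin’s book, and in Python rather than C# for brevity): the (current state, event) pairs map to predetermined next states in a lookup table, and the workflow step’s work executes during the transition. The order-workflow states and events are invented.

```python
class OrderWorkflow:
    """A workflow whose step sequencing is driven by a state transition table."""

    # The state transition table: (current_state, event) -> next_state.
    TRANSITIONS = {
        ("New", "Submit"): "AwaitingPayment",
        ("AwaitingPayment", "PaymentReceived"): "Shipping",
        ("AwaitingPayment", "Cancel"): "Cancelled",
        ("Shipping", "Delivered"): "Complete",
    }

    def __init__(self):
        self.state = "New"

    def handle(self, event):
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            # Invalid sequencing is rejected rather than silently ignored.
            raise ValueError(f"Event {event!r} not valid in state {self.state!r}")
        # The work of the workflow step would execute here, during the
        # transition (charge the card, notify the shipper, etc.).
        self.state = self.TRANSITIONS[key]

wf = OrderWorkflow()
for event in ["Submit", "PaymentReceived", "Delivered"]:
    wf.handle(event)
print(wf.state)  # -> Complete
```

Note how all the sequencing logic lives in one small table rather than being scattered through nested if/else blocks; that is the maintainability payoff claimed above.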
Messaging Patterns – The excellent book Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions by Gregor Hohpe and Bobby Woolf, published in 2003, contains several messaging patterns that can be applied to workflows to increase their decoupling and maintainability.
- Pipes and Filters – This is a simple pattern that most workflows are implicitly based upon: http://www.eaipatterns.com/PipesAndFilters.html. A version of this pattern designed for Windows Azure can be found at http://msdn.microsoft.com/en-us/library/dn568100.aspx. A code sample is available.
- Process Manager — http://www.enterpriseintegrationpatterns.com/ProcessManager.html. An example of the usage of this pattern in a Domain Driven Design architecture in a Windows Azure app can be found in Microsoft’s CQRS Journey book at http://msdn.microsoft.com/en-us/library/jj554200.aspx. See Chapter 3 for the RegistrationProcessManager. A code sample is available.
- Routing Slip — http://www.enterpriseintegrationpatterns.com/RoutingTable.html
- Message Bus — This is often used in complex messaging scenarios (like many of these workflow patterns) to decouple various message senders and receivers from each other. When using WCF you can use the IDesign ServiceModelEx PubSub utility coupled with the MSMQ binding to achieve this decoupling and the functionality of a Message Bus. The IDesign code is available at http://idesign.net/Downloads. You can see the generalized pattern at this link http://www.eaipatterns.com/MessageBus.html.
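As a tiny illustration of the Pipes and Filters pattern from the list above, the Python sketch below chains three filters. In a real messaging implementation each filter would be an independently deployed stage with a queue (the “pipe”) between stages; here a plain function call stands in for the pipe, and all the filter names and the message format are invented.

```python
# Each filter is an independent, composable step that takes a message in
# and passes a (possibly transformed) message out.
def parse(msg):
    """'id=42;name=widget' -> {'id': '42', 'name': 'widget'}"""
    return dict(field.split("=") for field in msg.split(";"))

def validate(record):
    if "id" not in record:
        raise ValueError("missing id")
    return record

def enrich(record):
    record["processed"] = True
    return record

def pipeline(msg, filters=(parse, validate, enrich)):
    """The 'pipe': feed each filter's output into the next filter."""
    for f in filters:
        msg = f(msg)
    return msg

result = pipeline("id=42;name=widget")
print(result)
```

Because each filter knows nothing about its neighbors, filters can be reordered, replaced, or hosted separately, which is where the decoupling benefit comes from.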
Saga – This pattern is similar to the Routing Slip, but subtly different. Note that this is not the same as the functionality called a “Saga” in several messaging frameworks. That “Saga” is really a Process Manager implementation! The true Saga pattern is useful when you need to programmatically compensate for errors in a long running sequence of independent atomic operations without the ability to use distributed transactions to wrap the sequence and do automatic rollbacks when encountering an error.
- Clemens Vasters has an excellent blog article on the real Saga pattern, complete with code at http://vasters.com/clemensv/2012/09/01/Sagas.aspx.
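The heart of the true Saga pattern can be sketched in a few lines: every completed step registers a compensating action, and on failure the compensations run in reverse (LIFO) order in place of a transactional rollback. This Python sketch is my own simplification under those assumptions; the step names are invented, and Vasters’ article linked above shows a fuller implementation.

```python
def run_saga(steps):
    """steps: list of (do, undo) callables. Returns True on success,
    False after compensating all completed steps on failure."""
    done = []
    try:
        for do, undo in steps:
            do()
            done.append(undo)  # remember how to undo this completed step
        return True
    except Exception:
        for undo in reversed(done):  # compensate in LIFO order
            undo()
        return False

def fail():
    raise RuntimeError("ship failed")

log = []
steps = [
    (lambda: log.append("reserve"), lambda: log.append("unreserve")),
    (lambda: log.append("charge"),  lambda: log.append("refund")),
    (fail,                          lambda: log.append("never-runs")),
]
ok = run_saga(steps)
print(ok, log)  # failure triggers refund, then unreserve
```

Notice that compensation is explicit application logic (a refund is a new operation, not an undo of the charge), which is exactly why sagas require more design effort than transactions.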
Scheduler-Agent-Supervisor – Somewhat similar to the Saga, the Scheduler-Agent-Supervisor pattern includes decoupled ways to check for errors and do compensation. It is expressed in terms of Windows Azure cloud services in this link http://msdn.microsoft.com/en-us/library/dn589780.aspx.
- It is worthwhile to read about this pattern, as well as the Saga, just to see how one might approach programmatic error compensation in a workflow when distributed transactions are not available. Also useful along these lines is the Compensating Transaction Pattern at http://msdn.microsoft.com/en-us/library/dn589804.aspx. The ideas demonstrated in these 2 patterns can be applied in technologies other than Azure.
Caveat Emptor – Why You Need Good Project Design (aka Planning) When Using Workflows
After considering some of the above patterns, one thing to be acutely aware of with workflows in general is that the complexity of the code implementing them can increase exponentially, and very quickly! Why?
- Workflows involve asynchronous calls.
- Workflows can seldom use distributed transactions to wrap the execution of a sequence of workflow steps. Sorry, no transaction rollbacks are available to make coding easy. Therefore manual and/or programmatic compensation must be built into the workflow design and implementation, and that compensation often adds more use cases. Often it is necessary to write additional workflows dedicated to doing error compensation for other workflows, including options for manual input and assistance in compensation. All this will surely result in lots more code to design, implement, and test.
- Workflows cannot be tested nearly as easily as normal, synchronous sequences of calls. The async nature of their sequencing makes testing much, much more involved, as does testing the error conditions, since those involve manual and/or programmatic compensation. Essentially, one must “speed up time” in order to efficiently test long running processes.
- Workflow code needs to be rock solid since it can be very, very time consuming to debug. Therefore, a complete analysis of potential failure points and scenarios needs to be done, and the design and implementation of the workflow and its parts need to “fail safe”. This can greatly expand the amount of code required. However some of it can be pushed down into reusable infrastructure.
- The requirements and definition of workflows typically involves a number of people in different roles – Customers, users, product owners, analysts, architects, developers, quality assurance, etc. Not only is it time consuming to get things nailed down, but the number of possible permutations of a workflow can be vast, even for a fairly simple system.
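Regarding the “speed up time” point in the testing bullet above, a common approach is to inject a clock abstraction into the workflow so tests can fast-forward days in milliseconds. Here is a hedged Python sketch; the names (FakeClock, is_timed_out) are illustrative only, not from any particular framework.

```python
import datetime

class FakeClock:
    """A test double for the system clock that can be advanced at will."""
    def __init__(self, start):
        self.current = start

    def now(self):
        return self.current

    def advance(self, **kwargs):
        self.current += datetime.timedelta(**kwargs)

def is_timed_out(started_at, clock, timeout=datetime.timedelta(days=3)):
    """A workflow decision point: has the approval wait expired?"""
    return clock.now() - started_at >= timeout

clock = FakeClock(datetime.datetime(2015, 1, 1))
started = clock.now()
print(is_timed_out(started, clock))  # just started, not timed out
clock.advance(days=4)                # fast-forward past the timeout
print(is_timed_out(started, clock))  # timed out, without waiting 4 real days
```

In production the workflow would receive a clock that reads the real system time; the workflow code itself never calls the system clock directly.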
So, if you are going to use the great power of workflows in your apps you need to design your project plan carefully to include all the work it takes to build solid workflows — Ramp up the learning curve, design, implement, and test the workflows. This time may be surprisingly large, considering the above points.
I hope you find this article’s references as useful as I have.
My previous blog article “Build Cloud Apps that Deliver Superior Business Value” of March 16, 2014 lists 4 references that were quite helpful in shifting my perspective to understanding what it takes to build scalable, failsafe “cloud-native apps”. A “cloud native application” is an app whose architecture and design has been guided by software engineering practices repeatedly used in highly successful cloud apps. [Wilder, p ix]. The body of knowledge required to build cloud-native apps is generally described in the 4 sources listed in my previous blog article.
Upon reading these 4 sources, plus others I have read since, it becomes clear that developing failsafe, robust cloud-native apps takes significantly more effort than developing a functionally similar normal app hosted in a data center. Beyond failsafe, it takes even more development effort to make a cloud-native app highly scalable.
To aid in more accurate estimation of the development effort for cloud-native apps, this article aims to explain the root causes of why that effort is significantly larger. It also identifies explicit areas requiring a larger effort.
As developers coming from normal apps that run in data centers, we have all built up a set of expectations about what it takes to develop apps. In early 2014, as I began coming up to speed on developing apps for Azure, I found my “normal data center” expectations being violated time after time due to the following underlying characteristics of the cloud: multi-tenancy, commodity hardware, and programmatic error compensation (versus transactions) to deal with errors. Thanks to Bill Wilder’s Cloud Architecture Patterns for providing some of these categories. Below is a sketch of how each category impacts the effort required to develop robust cloud-native apps.
Multi-tenancy
We often think of multi-tenancy as in a SaaS app, where multiple organizations share the capabilities of the app such that a number of users in each organization can use the SaaS app concurrently without “getting in each other’s way”. The same multi-tenancy concept is used for many of the basic services offered by cloud platforms as well.
Do you think that PaaS load balancer your cloud app is using belongs only to you? No. Likely it is a multi-tenant load balancing cloud service, shared by other cloud apps as well. The same applies to many other PaaS features, like worker roles, web roles, data storage, identity management, etc. [Wilder, pp 77 – 79]. It may well apply to IaaS features as well.
So, when an individual multi-tenant resource becomes overloaded and slows down, or becomes temporarily unavailable because the cloud “fabric” shifts some of the overloaded resource’s users to another, less loaded resource behind the scenes, what do you need to do to ensure your cloud app remains robust and responsive?
Design, write, and test code in your app to deal with this situation. Some of the patterns listed in the previous blog article’s 4 references that deal with this are: Auto-Scaling, Busy Signal, Throttling, Retry, and Circuit Breaker for starters.
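To illustrate one of these, here is a minimal, hypothetical Python sketch of the Retry pattern with exponential backoff and jitter, waiting out the “busy signal” of an overloaded multi-tenant resource (the names and parameters are my own, not from the references):

```python
import random
import time

class TransientError(Exception):
    """Raised when a shared, multi-tenant resource is busy or briefly unavailable."""

def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry pattern: re-issue a failed call with exponential backoff plus jitter,
    giving an overloaded resource time to recover or be rebalanced."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # retry budget exhausted; surface the busy signal
            delay = base_delay * (2 ** (attempt - 1)) * (0.5 + random.random() / 2)
            sleep(delay)

# Example: a call that succeeds on the third attempt.
attempts = []
def flaky_call():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("resource busy")
    return "done"

result = retry_with_backoff(flaky_call, sleep=lambda _s: None)  # stub out waiting
```

The jitter keeps many retrying clients from hammering the recovered resource in lockstep.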
Commodity Hardware
Our data centers and much of their hardware have been designed to maximize the Mean Time Between Failures (MTBF). We don’t want our hardware crashing, so we use failover hardware designs. We use high-end (and expensive) hardware like RAID disk drives, etc. As a result, we have deep-seated expectations that hardware failures are rather rare. And our software designs and development techniques reflect this expectation.
One main reason cloud computing is often more cost efficient than data center computing is that clouds rely upon commodity hardware with a high value-to-cost ratio. But commodity hardware fails more often. Therefore, cloud computing focuses on minimizing the Mean Time to Recovery (MTTR) rather than maximizing the MTBF [Wilder, pp 79 – 82]. This is a completely different dynamic from what developers are used to in data center apps.
So what do you need to do to make your cloud app robust (not crash) and responsive in the face of significantly more frequent hardware failures?
Design, write, and test code in your app to deal with this situation. Some of the patterns listed in the previous blog article’s 4 references that deal with this are: Node Failure Pattern, Busy Signal, Retry, Circuit Breaker, and perhaps Health Endpoint Monitoring for starters.
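As another illustration, here is a hypothetical Python sketch of the Circuit Breaker pattern, which fails fast during an outage instead of letting every caller wait on a timeout (names and thresholds are illustrative, not from the references):

```python
import time

class CircuitBreaker:
    """Circuit Breaker pattern: after repeated failures, stop calling the failing
    node for a cooling-off period so callers fail fast instead of waiting on
    timeouts; after the period, one trial call probes for recovery."""
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (calls allowed)

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call through
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit fully
        return result
```

Failing fast keeps request threads from piling up behind a dead node, which is what minimizing MTTR looks like from the caller’s side.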
Programmatic Error Compensation versus Transactions to Deal with Errors
Most of us routinely use transactions as a way of compensating for errors when developing traditional apps. An exception occurs? No problem: just design the software to roll back the transaction that wrapped the operation in progress, and all the changes made thus far are gone. Easy!
In the cloud one cannot always rely upon transactions as the common means for compensating for errors. Why? Part of the reason is that some cloud resources do not support transactions! Nada. You will need to closely examine this when selecting the kinds of resources you plan to use. Does that data storage support transactions? Maybe not! How about that queue you want to use? Some do not support transactions, while others support them only in certain limited configurations.
Another part of the reason for not using transactions is the Eventual Consistency that is common in the cloud. How can one do a transaction on an operation that involves an eventually consistent piece of data? Hmmm… For more on Eventual Consistency and Data Consistency please read Cloud Architecture Patterns or Cloud Design Patterns: Prescriptive Architecture Guidance for Cloud Applications http://msdn.microsoft.com/en-us/library/dn568099.aspx
Without transactions to do error compensation automatically, how will you compensate for errors?
Design, write, and test code in your app to deal with this situation. Some of the patterns listed in the previous blog article’s 4 references that deal with this are: the Compensating Transaction pattern and the Scheduler Agent Supervisor pattern, plus the primers concerning consistency. These 2 articles are also key: “Failsafe: Guidance for Resilient Cloud Architectures” http://msdn.microsoft.com/en-us/library/azure/jj853352.aspx and “Best Practices for the Design of Large-Scale Services on Windows Azure Cloud Services” http://msdn.microsoft.com/en-us/library/azure/jj717232.aspx.
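To make the idea concrete, here is a hypothetical Python sketch of the Compensating Transaction pattern: each completed step registers an undo action, and a later failure triggers the undo actions in reverse order, since there is no transaction to roll back (all names are illustrative):

```python
class CompensatingWorkflow:
    """Compensating Transaction pattern: each completed step records an undo
    action; if a later step fails, the undo actions run newest-first to
    restore consistency, since there is no transaction to roll back."""
    def __init__(self):
        self._undo_stack = []

    def run_step(self, do, undo):
        """Execute `do`; on success, remember `undo` for later compensation."""
        result = do()
        self._undo_stack.append(undo)
        return result

    def compensate(self):
        """Undo completed steps in reverse order, continuing past undo failures."""
        while self._undo_stack:
            undo = self._undo_stack.pop()
            try:
                undo()
            except Exception:
                pass  # production code would log, retry, or alert here

# Example: the third step fails, so the first two are compensated in reverse.
log = []
wf = CompensatingWorkflow()
wf.run_step(lambda: log.append("reserve seat"), lambda: log.append("release seat"))
wf.run_step(lambda: log.append("charge card"), lambda: log.append("refund card"))

def notify():
    raise RuntimeError("email service down")

try:
    wf.run_step(notify, lambda: None)
except RuntimeError:
    wf.compensate()
```

Note that each undo must itself be safe to run against eventually consistent data; that is part of the extra design work the pattern demands.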
This is only a partial list of areas that require more code to be developed in a cloud-native app. Other areas are Scalability (including the awesome capability to auto-scale), Health Monitoring, Instrumentation and Telemetry, and Service Metering. Perhaps throw in deployment as well.
Bill Wilder summarizes the situation by saying “Architecting to deal with failure is part of what distinguishes a cloud-native application from a traditional application” [Wilder, p 82].
Above we have seen that cloud-native apps need a substantial amount of code to be designed, written, and tested that is not required in a normal app running in a data center. Much of this “extra code” has absolutely nothing to do with providing the basic functionality of the app itself. Rather, the “extra code” is required to make the app highly usable in the cloud. In other words, it is code to ensure the app meets its supplemental requirements. Estimating the effort to develop this “extra code” needs to be done in addition to estimating the effort to develop the code that produces the app’s functionality.
Please note that many of the areas requiring more code, and many of the patterns mentioned above, are amenable to code reuse. Much of this “extra code” is infrastructure code that can be reused in the form of libraries. That’s good news for organizations developing a steady stream of cloud-native apps, so that the full amount of “extra effort” may not have to be expended for each cloud-native app developed.
The following links will provide the reader with code examples of some of the above-mentioned solutions to robustness and scalability challenges in the cloud. Also consult Cloud Design Patterns for code examples.
The following links are all more or less complete apps with downloadable code samples that show how their authors dealt with many of the above areas, including the use of some of the above-mentioned patterns. Looking at the code samples in the books, plus reading the actual source code, is invaluable in giving one an idea of the kind of complexity (and potential effort) involved in developing cloud-native apps. And all of the books are available online for free!
- Building Hybrid Applications in the Cloud on Microsoft Azure, circa 2012, by Microsoft Patterns and Practices. http://msdn.microsoft.com/en-us/library/hh871440.aspx
- Developing Multi-tenant Applications for the Cloud, 3rd Edition, circa 2012, by Microsoft Patterns and Practices. http://msdn.microsoft.com/en-us/library/ff966499.aspx
- Exploring CQRS and Event Sourcing, circa 2012, by Microsoft Patterns and Practices. http://msdn.microsoft.com/en-us/library/jj554200.aspx
- Cloud Service Fundamentals in Windows Azure, circa 2013. A body of sample code from MSDN that contains “fundamental building blocks for scale-out Azure apps”. Built based on “real world customer learnings of the Windows Azure Customer Advisory Team (CAT)”. https://code.msdn.microsoft.com/Cloud-Service-Fundamentals-4ca72649
References
- Cloud Architecture Patterns by Bill Wilder, Copyright 2012 by Bill Wilder, O’Reilly Media, Sebastopol, CA
I hope you find this article and its references as useful as I have.
There is a big difference between building “apps in the cloud” and building scalable, failsafe “cloud-native apps”. This difference is easy to miss. It’s often not apparent until one has put in some time dealing with cloud apps of both kinds.
The following list of references greatly aided me in changing my point of view from “an app in the cloud”, to a useful understanding of what it takes to develop “cloud-native apps” that deliver great business value — functionality, performance, scalability, resiliency, and cost effective operation.
Cloud Architecture Patterns by Bill Wilder, Copyright 2012 by Bill Wilder, O’Reilly Media, Sebastopol, CA
This is a wonderful little book, and an easy read as well. It provides you with all the basic knowledge to understand what “cloud-native apps” are; why they demand different architectures; and the basic patterns of putting them together in a scalable, robust manner. In addition to 11 patterns, it presents 4 primers that educate you on the key Cloud concepts of Scalability, Eventual Consistency, Multi-tenancy and Commodity Hardware, and Network Latency. Your architectures will have to effectively deal with all of these. If I had to choose one book to launch a successful stint developing cloud apps, I’d choose this one. It applies to both Windows Azure and Amazon Web Services, although the example code is implemented via Azure.
“Disaster Recovery and High Availability for Azure Applications” by Microsoft focuses on availability and scalability, with disaster recovery as a part of that. This is an excellent document. Added on 10/21/2015.
“Failsafe: Guidance for Resilient Cloud Architectures” by Microsoft provides an extremely useful approach (a method) to identify all the things you have to do to design well performing and failsafe “cloud-native apps”. See the article at this link: http://msdn.microsoft.com/en-us/library/windowsazure/jj853352.aspx
“Best Practices for the Design of Large-Scale Services on Windows Azure Cloud Services” by Microsoft supplies more of the implementation details than the “Failsafe” document’s methodology. The two go together hand-in-hand to guide you. See it at this link: http://msdn.microsoft.com/en-us/library/windowsazure/jj717232.aspx
Cloud Design Patterns: Prescriptive Architecture Guidance for Cloud Applications by Homer, Sharp, Brader, et al. Copyright 2014, Microsoft Patterns and Practices. This came out in Feb 2014 and is available in paperback (for a fee), as a PDF (free download), or as a set of web pages. It contains 24 patterns, plus 10 guidance topics. There are also code snippets and samples provided as separate downloads. I really like this book/website. See it at this link: http://msdn.microsoft.com/en-us/library/dn568099.aspx
I hope these references are as valuable to you as they have been to me!