Category Archives: Software Development

Items related to software development

Separating Concerns – Part 1: Libraries

Introduction

In large applications, particularly in enterprise applications, separation of concerns is critical to ease maintainability. Without proper separation of concerns, applications become too large and too complex, which in turn makes maintenance and enhancement extremely difficult. Separating application concerns leads to high cohesion, allowing developers to better understand code behavior which leads to easier code maintenance.

History

In the previous decade, architects designed applications using an n-tier approach, separating the application into horizontal layers such as user interface, business logic, and data access. This approach is incomplete, however, as it fails to address partitioning applications vertically. Unrelated concerns are commingled, resulting in a confusing architecture which lacks clearly defined boundaries and has low cohesion.

The other problem with an n-tier architecture is how it is organized from top to bottom, with the topmost layer being the presentation layer or user interface, and the bottommost layer representing the persistence layer or database. Instead of thinking of the architecture as horizontal layers, think of it as rings, as in the Onion Architecture described by Jeffrey Palermo. (While Jeffrey proposed the pattern name, the architectural patterns had been defined previously by others.)

Separating Concerns

Given that a separation of concerns and increasing cohesion are the goals, there are several mechanisms towards achieving them. The solutions that follow include the use of libraries, services, and frameworks as ways to reach these goals.

The Library

A library is a set of functions used to build software applications. Rather than requiring an application to be a single project containing every source file, most programming languages provide a means to segregate functionality into libraries. While the name of the facility varies (package, module, gem, jar, and assembly are a few), the result is the same: developers can separate functions physically from the main application project, improving both cohesion and maintainability.

Core, the new Manager

A library should not be a collection of unrelated functions; it should contain related functions so that it is highly cohesive. An application developer should be able to select a library for use based on its name and purpose, rather than having to pore through the source code to find the function or functions needed. A library should have a descriptive name and contain a cohesive set of functions towards a singular purpose or responsibility.

Creating a library named Core containing a large set of unrelated functions is separation for the sake of separation, and that library should not be treated as a library but as part of the application; it should not be reused by other applications.

Coupling (aka, the Path of Pain)

When an industry analyst shares their observations about code reuse in the enterprise, the findings indicate that actual code reuse is very low. A main reason that code reuse is so low is tight coupling. Coupling refers to how two libraries (or functions) rely on each other. When a library relies upon another library, the library relied on is referred to as a dependency. When an application relies on a library, it implicitly relies on the library’s dependencies as well. In many larger applications, this can lead straight to dependency hell.

Since tight coupling can lead to serious maintenance issues during an application’s lifecycle, limiting dependencies should be first and foremost in application and library design. If a function is to be moved from an application to a library, and that function must bring with it a dependency that was not previously required by the target library, the cost of adding the new dependency to the library must be considered. Too often, particularly in the enterprise where code is only reviewed internally by a single development team, poor choices are made when creating libraries. Functions are routinely moved out of the main project and placed into arbitrary libraries with little thought given to the additional dependencies of the library.

An Example

As an example, a web application has a set of functions for validating email addresses. The simplest validation methods may only depend upon regular expression functions, which are part of every modern language runtime used today. A more complete validation of an email address may check that the domain is actually valid and has a properly registered MX record in DNS. However, validating the domain involves sending a request to a service and waiting for the response indicating a valid domain before the email address is determined to be valid.

There are many things wrong in this example. First, the email validation function has a dependency on a domain validation function. Due to the fact that the set of valid domains is continuously changing, the domain validation function itself has a dependency on a domain name service. Of course, the domain name service depends upon a network domain name service, which may subsequently depend upon an internet service as well. By calling one library function, the application has managed to send a request to another machine and block a thread waiting for a response.

In the case of an error, the disposition of the email address is then unknown. Is it a valid email address that could not be validated due to a network error? Or is it a valid email address but flagged as invalid because the domain name could not be validated due to an internal DNS server not allowing external domains to be returned?
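One way to picture the problem is a minimal Python sketch in which the library function keeps only its regular expression dependency and the caller supplies the domain check. The names `Disposition`, `validate_email`, and `domain_check` are hypothetical, the pattern is deliberately simplistic, and the sketch assumes a failed lookup surfaces as an `OSError`. The third result makes the "unknown" disposition described above explicit instead of reporting it as invalid.

```python
import re
from enum import Enum
from typing import Callable, Optional


class Disposition(Enum):
    VALID = "valid"
    INVALID = "invalid"
    UNKNOWN = "unknown"  # syntax is fine, but the domain could not be checked


# Deliberately simple pattern (an assumption for illustration); real address
# syntax is more permissive than this.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@([^@\s]+\.[^@\s]+)$")


def validate_email(
    address: str,
    domain_check: Optional[Callable[[str], bool]] = None,
) -> Disposition:
    """Check syntax locally; defer the domain check to an injected callable."""
    match = _EMAIL_PATTERN.match(address)
    if match is None:
        return Disposition.INVALID
    if domain_check is None:
        # Syntax-only validation: no network dependency at all.
        return Disposition.VALID
    try:
        domain_is_known = domain_check(match.group(1))
    except OSError:
        # The lookup itself failed, so the address is neither valid nor invalid.
        return Disposition.UNKNOWN
    return Disposition.VALID if domain_is_known else Disposition.INVALID
```

Called without a `domain_check`, the function depends on nothing beyond the language runtime; whoever passes one in takes on the network dependency and decides what to do with `UNKNOWN`.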

The coupling in the email validation library is clearly a problem, but what happens as the business requirements evolve over the life of the application? Consider the situation where new accounts are being created by spammers from other countries. To combat the spam accounts, email addresses must now be validated to ensure that the IP address originates from within the United States. The email validation function now has a new dependency, a geolocation service that returns the physical address of a domain. However, the service requires the use of separate endpoints for testing and production. The email address validation function is now dependent upon two services and configuration data to determine which service endpoint to use.

At this point, it is obvious that the complexity of validating an email address is not something that can be accomplished in a library function.

This article will continue with Part 2 on services.

StrangeLoop 2012

This past weekend I attended my 2nd StrangeLoop conference. StrangeLoop is an annual conference held in St. Louis, MO and for the last four years it has managed to draw some impressive talent. Unlike other events I attend, StrangeLoop is an independent conference and is not dominated by a single platform, technology, or language. The quality and level of content is also high, making StrangeLoop a place where introductory sessions are frowned upon — attendees want deep, intriguing sessions where experienced practitioners can learn new things. Attendees at StrangeLoop are commonly pushing the leading edge, and the session topics are state of the art, sometimes on the edge of redefining software development in the coming years.

So how was it?

Day 1

Opening Keynote: VoltDB, Michael Stonebraker

In the first thirty minutes, I had a strong sense that the conference was off to a rough start. In what was clearly a product-focused talk, the VoltDB CTO made a weak case for ACID, eliciting frequent groans from the audience. Make no mistake, Stonebraker is a really smart guy, but too much of his time was spent bashing other databases (if you can technically call eventually consistent storage systems without a query language databases). As an opening keynote for the conference, this was the worst possible choice. Now, I have followed VoltDB since the early bits, and was impressed with the lock-free approach that serializes all operations, but this talk didn’t spend enough time on the benefits of VoltDB.

Get a Leg Up with Twitter Bootstrap

For the first actual session of the day, Howard Lewis Ship took the audience on a tour of Twitter Bootstrap, which is rapidly becoming the File, New Web Site project template. In fact, I was glad to see an entire gallery of customized Bootstrap templates; hopefully now Bootstrap-originated sites won’t all look the same. I’m a fan of Bootstrap, and this was a solid introduction, but I (and the rest of the audience, I’m sure) was hoping for a bit more depth.

Software Architecture using ZeroMQ

My expectations were high on this session, and I was really hoping to get some insight into 0MQ, and how to build systems using it as the authors intended. While Pieter Hintjens provided some high-level coverage of ZeroMQ, I felt this session should have been called “Software Architecture 101” and could apply to using any technology stack. I gained zero insight into ZeroMQ beyond what the executive summary already covered.

I was really starting to doubt my remaining session choices at this point; the first two were boring, following a bad keynote. So I reached out to some friends to hear their experiences. This altered my schedule for the rest of the day.

A Whole New World by Gary Bernhardt

This session was a short, light-hearted lunch session with a total Rick-roll ending. At least I ate my lunch, took a break, and made a couple of phone calls. I got blocked out of the Twitter Zipkin session due to space constraints, but I heard it was nothing special, so I’m glad I didn’t miss anything.

Building an Impenetrable Zookeeper

Finally, an in-depth session given by a member of the team providing commercial support — if only I used Zookeeper. I understand what Zookeeper does, and the subject matter dealt with the type of issues organizations encounter trying to run it. I found this very interesting, particularly since I have a good understanding of distributed consensus and configuration — and this is not an easy nut to crack. I came away with some interesting notes that I’ll keep in mind when I create systems that either interact with Zookeeper, or perhaps when I create yet-another-open-source-project (Topshelf Bartender perhaps!). 

Graph: composable production systems in Clojure

What Jason Wolfe (of Prismatic, the news aggregator) offered up was a refreshing approach to building a functional container. Graph is comparable to Guice or Dagger, and provides a declarative approach to system composition. While at the lowest level it seems to offer the same features as an IOC container, the way it was presented and explained was really nice. I enjoyed this session, and took away a few notes for my own use. I also gained a greater fondness for Clojure, which was a recurring theme as the sessions continued.

The Database as a Value by Rich Hickey

So having done well with a Clojure talk, I decided to take in another one from the man himself, the author of Clojure. The talk on Datomic was a nice realization that we are reaching a level where immutable databases are available and usable. Datomic is sweet, and how it handles IO and manages to spread the Live Index to multiple nodes for fast access is clever. I enjoyed this talk and look forward to seeing the ideas in Datomic shape a new wave of immutable storage systems (I’m not sure it’s a database, despite the intense conversation at the pre-party on that very subject). And again, an increasing appreciation for Clojure.

That’s how Day 1 ended for me, on a good note. So we went to Pappy’s BBQ and managed to snag one of the last remaining racks of ribs (apparently they sell out fairly early, while in line the chicken, turkey, and chopped brisket sold out). After dinner, we returned to the hotel to continue working through some code that I’d been toying with throughout the day (FeatherVane-related, if you were curious).

Day 2

Computer Like the Brain by Jeff Hawkins

This talk was almost scary. The depth of knowledge on the human brain is staggering. At one point, I saw a tweet suggesting that the Terminator himself was about to pop onto the stage and tell Hawkins to abandon his research for the sake of humanity. Yes, it was that scary. The way his company has built out models that match the human brain is impressive, and the results of some of their predictive systems were very close to reality. However, predicting the future is hard, and it’s easy to get it wrong. While many systems have promised to give us brain-like capabilities, most if not all of them have been limited in applicability or flat out failed when generalized. I suppose that is actually good for us (mankind).

Y Not? Adventures in Functional Programming

Jim Weirich is pretty well known (well, apparently I don’t know anybody — a joke that never ended during the conference) and he took the audience on a ride using Clojure to explain the Y-Combinator. When the talk started, he promised a fun ride that would likely be inapplicable to anything any of us does in our daily jobs. And he was right, it was fun! Live coding works when the presenter can do it and do it well, and this was a great session. Very enjoyable, the day was off to a great start!

Runaway complexity in Big Data and a plan to stop it.

Last year, Nathan Marz open-sourced Twitter Storm during his session at StrangeLoop 2011, and it was an impressive system (written in Clojure, big shock). The real-time analytics capabilities of Storm are slick, and it sounds like it’s only gotten better over the past year. I was hoping for great things again this year, however, what I found was a bit of a reminder of a talk in 2008. At QCon San Francisco in 2008, Greg Young gave a talk about Unleashing your Domain Model, covering how insert-only data stores, event sourcing, and real-time projection of data into views can benefit real-time applications. It seems like even today these ideas are flowing through the minds of the real-time web properties.

Eventually Consistent Data Structures by Sean Cribbs

This was an eye-opening talk about newly defined data structures that enable concurrent updates that are eventually consistent. As more distributed systems are being built, the ability to perform concurrent updates on records that resolve conflicts easily is needed. As a big fan of algorithms, I found the way these data structures were assembled very interesting, despite their very specific purpose. I had originally planned on attending Oleg Kiselyov’s talk on Guessing Lazily, but the presenter was spending too much time flipping through random snippets of code that were very hard to follow, which made it difficult to grasp what was being done. Which is a bummer, because I saw quick segments of parser combinator code, which I rely on heavily in my parsers.

Taking Off the Blindfold

This talk was awesome, and Bret Victor had people cheering. The flow of his presentation where he shared with us his vision for a dynamic, interactive IDE had some developers just screaming for more. The high point for me was one of my favorite childhood memories — taking the entire bin of Legos and dumping it out on the floor. By getting everything out in front of you, you needn’t think about what you’re going to build in a vacuum, you can see, touch, and draw items from the random chaos laid out before you. Some of the ideas here seemed to redefine what should be expected of an IDE.

The State of JavaScript

Yes, Brendan Eich, the inventor of JavaScript, laid out the awesome coming in ECMAScript 6. Some of the proposed features are awesome (and strangely enough available in the nightly Firefox builds), while some features have me concerned. CoffeeScript has clearly influenced some features, and I sensed a subtle Microsoft influence in some of the language and keyword choices. I was glad to see byte code clearly off the table, but disappointed to see macros up for possible inclusion. Brendan is an incredible presenter, and you could hear the passion in his voice.

With that, the conference was wrapped. I had a great time, had some great conversations, and really enjoyed some of the sessions. It’s great to be able to take the time to attend, listen to, and appreciate content once in a while without worrying about my own presentation. If you can make the time next year, and the content looks good, I highly recommend StrangeLoop!

The Right Tool, The Right Time

Over the past few months I have been reviewing many of the products I was involved in creating, both as a developer and an architect, and have assembled an inventory of the technology and architecture used. With a catalog of products spanning more than eighteen years, a diverse set of architectural styles is represented. On one end of the spectrum are client/server systems deployed on-premise and on the opposite end are software-as-a-service (SAAS) browser-based products. Most of these products are line-of-business systems and include both heavy user interaction and background data processing. In fact, two separate products offer a similar feature set targeted at the same market but sit on opposite ends of the architectural spectrum. The first product was built in the 90s and is a client/server system; the second was built more recently during the SAAS era, targeting the web.

What follows are a few of the common design choices that I encountered, with my take on how appropriate that same choice would be today.

Data Storage

As I looked into each product, I examined how various requirements were addressed given the tools available at the time. For example, I didn’t question the use of a flat file to store reference data in the early client/server products since flat files were perfectly acceptable at the time. However, this led me to question some design choices when looking at SAAS products — including some choices that you might not expect. For instance, why is a flat file not an acceptable design choice for a system developed today? The data is still the same reference data, yet current guidance would suggest this reference data be stored in a database, most likely a relational database.

Is this because developers have become too lazy to write the component to read the file? Surely not, since a component will have to be written to import the reference data into the database. While it can be done fairly easily using database tools, the process still has to be scripted out and repeatable in case the import needs to be repeated on a new database.

Let’s enhance the problem and add a time dimension to the reference data, making updates available every thirty days. Now, not only is an initial import needed, but the import component will also need to support updating the database with the new content. Again, this could be done using database tools — a simple truncate table and repeat the import process. But what if developers have created relationships between the reference data table and other tables in the system? What if those relationships were created using the row id instead of the appropriate business identifier? At that point, the table cannot be simply truncated and the update process must now perform a complete delta of the existing and updated data sets and merge the changes into the database. That certainly doesn’t sound lazy — if anything, it sounds downright painful.
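To give a sense of what that delta involves, here is a minimal Python sketch that compares the existing and updated data sets by business identifier and reports which rows need to be inserted, updated, or deleted. The function name `diff_reference_data` and the dictionary-of-rows shape are assumptions made for illustration; a real update process would also have to apply these changes while preserving the row ids that other tables reference.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def diff_reference_data(
    existing: Dict[str, Row],
    updated: Dict[str, Row],
) -> Tuple[List[str], List[str], List[str]]:
    """Return (inserts, updates, deletes) as sorted lists of business keys."""
    # Keys only in the new data set must be inserted.
    inserts = sorted(key for key in updated if key not in existing)
    # Keys only in the old data set must be deleted (or retired).
    deletes = sorted(key for key in existing if key not in updated)
    # Keys in both sets need an update only when the row content changed.
    updates = sorted(
        key for key in updated
        if key in existing and existing[key] != updated[key]
    )
    return inserts, updates, deletes
```

Even this small comparison assumes that every row carries a stable business identifier, which is exactly what relationships built on row ids tend to obscure.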

Another question that came to mind when using a relational database to store reference data was “which database?” Now, if the first answer that popped into your head when you read that was “SQL Server,” or even worse “the database,” therein lies the real problem.

A product is not just an application, it is a system composed of one or more applications, multiple components, multiple services, and multiple databases. Consider the earlier example that used a flat file to store reference data. The flat file itself is a separate database. In a system of any complexity there are many different sets of reference data, all of which are stored in their own separate flat files. Therefore, the system has multiple databases, each using the appropriate technology based on how that database is used.

If the reference data had remained in a flat file, updating it would mean simply replacing the original file with the new one, and the system would continue. No special import or update process would be required.
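As a rough illustration of how little machinery the flat-file approach needs, the following Python sketch reads reference data from a CSV file and publishes an update by writing a temporary file and swapping it into place with `os.replace`. The two-column `code,name` layout and the function names are assumptions for the example, not a description of any of the products discussed here.

```python
import csv
import os
import tempfile
from typing import Dict


def load_reference_data(path: str) -> Dict[str, str]:
    """Read the whole flat file into a code -> name lookup."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {row["code"]: row["name"] for row in csv.DictReader(handle)}


def publish_reference_data(path: str, rows: Dict[str, str]) -> None:
    """Write the new file next to the old one, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["code", "name"])
        writer.writerows(sorted(rows.items()))
    # Replacing the file in one step means readers see either the old file
    # or the new one, never a partially written update.
    os.replace(temp_path, path)
```

The update is the replacement itself; there is no merge step and no dependency on row ids.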

Nested Object Graphs

Another common design I saw, particularly in products that manage a revolving set of accounts, was the use of a deeply nested object graph that is persisted in a relational database. As accounts were accessed, the entire object graph would be loaded from the database and presented to the user. Once the user made whatever changes were necessary at the time, the account was then saved to the database. In order to save the object graph, the nodes at each level in the graph are compared with the database, and deltas are generated to update the database tables.

In early examples of this design, a pessimistic locking system was implemented to track user activity and prevent multiple users from working on the same account at the same time. This was common in the client/server products, since even at that time record locking using ISAM files (or even network file locking) was fairly problematic.

As products moved to the web, a more optimistic locking strategy was used. I found two different conflict resolution methods, the first of which used a timestamp to track modifications to an account. If an update was received and the timestamp didn’t match, the later update was rejected. The second method was “last write wins,” updating the account to whatever was in the later update — possibly and quite commonly losing previous updates from other users. This got real interesting when two updates were performed at the same time.
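The first method can be sketched in a few lines of Python. This is a simplified in-memory illustration, not the code from any of these products: it uses an integer version counter where the original systems used a timestamp, and `AccountStore`, `AccountRecord`, and `StaleUpdateError` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


class StaleUpdateError(Exception):
    """Raised when an update is based on an out-of-date copy of the account."""


@dataclass
class AccountRecord:
    data: Dict[str, str]
    version: int = 0


class AccountStore:
    def __init__(self) -> None:
        self._records: Dict[str, AccountRecord] = {}

    def create(self, account_id: str, data: Dict[str, str]) -> None:
        self._records[account_id] = AccountRecord(dict(data))

    def read(self, account_id: str) -> AccountRecord:
        # Hand back a copy so each caller works on its own snapshot.
        current = self._records[account_id]
        return AccountRecord(dict(current.data), current.version)

    def save(self, account_id: str, data: Dict[str, str], expected_version: int) -> int:
        current = self._records[account_id]
        if current.version != expected_version:
            # Someone else saved first, so the later update is rejected.
            raise StaleUpdateError(account_id)
        current.data = dict(data)
        current.version += 1
        return current.version
```

Dropping the version comparison in `save` turns this into "last write wins," which is how earlier updates from other users get silently lost.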

Neither of these solutions makes sense today for SAAS applications. In an environment where multiple users may be interacting with an account at the same time, it’s more important to look at providing users with a task-based user interface that captures the intent of each action on an account. For example, loading an entire account just to change the billing address creates unnecessary data movement that can limit throughput (read: scalability concern). At the same time, preventing a user from adding a charge to an account because another user slipped in behind them to update the phone number creates an unnecessary user burden. If updating the billing address, updating the phone number, and adding a charge to an account were explicit actions (read: commands) that can be performed on an account, they could all be performed simultaneously without conflict.

Note that the Command-Query Responsibility Segregation (CQRS) or even just Command-Query Separation (CQS) architectural styles specifically address this type of design.
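A small Python sketch shows why explicit commands avoid the conflict. Each command carries only the data for its own intent and touches only the part of the account it is about, so the three actions can arrive in any order. The class names here are hypothetical and the sketch ignores persistence and messaging entirely.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class Account:
    billing_address: str = ""
    phone_number: str = ""
    charges: List[Decimal] = field(default_factory=list)


@dataclass(frozen=True)
class UpdateBillingAddress:
    address: str

    def apply(self, account: Account) -> None:
        account.billing_address = self.address


@dataclass(frozen=True)
class UpdatePhoneNumber:
    phone_number: str

    def apply(self, account: Account) -> None:
        account.phone_number = self.phone_number


@dataclass(frozen=True)
class AddCharge:
    amount: Decimal

    def apply(self, account: Account) -> None:
        # Appending never overwrites another user's work on the account.
        account.charges.append(self.amount)


def apply_all(account: Account, commands) -> Account:
    """Apply each command in the order given and return the account."""
    for command in commands:
        command.apply(account)
    return account
```

Because no command rewrites the whole account, applying the three commands in either order leaves the account in the same final state.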

Stored Procedures

In the example above, a deeply nested object graph was loaded from the database. In a system designed today, a developer would most likely reach for an object-relational mapper (ORM) to deal with loading and saving the object graph to the database. There are many to choose from (Hibernate, NHibernate, and Entity Framework are a few) and they solve the problem of binding object graphs to relational database tables very well. In fact, most ORMs today can generate the DDL needed to create the database objects as well — eliminating the need to write table creation scripts by hand.

At this point, I can hear the blood pressure of many database administrators reading this rising through the roof. With SQL book in hand and years of experience writing stored procedures full of selects and cursors, they tell the story of how a hand-tuned stored procedure that returns a sequence of forward-only record sets in a single round trip to the database server is the only way the scalability requirements of the application can be met. I’m not saying that using a stored procedure in this situation is wrong, but making a stored procedure the first tool you pull out of the toolbox is very wrong indeed.

Why is it wrong? Creating a stored procedure to read data as the first approach is wrong because it is an optimization. Optimizing components of a system before that particular component has been identified as a bottleneck will lead to increased complexity, and that complexity will breed quickly in the project. And as complexity increases across the project, long term maintainability suffers as the capabilities of the development team are challenged. Yep, you guessed it, the stored procedure first approach is a classic case of premature optimization.

How does using a stored procedure in this way breed complexity? First of all, it establishes a myth that reads are a problem. As functionality is added to the system, developers come to believe that any account-related read must be done with a stored procedure, or else they will be held responsible for inadequate performance, so they create more read procedures. As features continue to be implemented, more data elements are added to the schema, requiring every stored procedure to be updated as the schema changes. That creates more work for developers, who must now touch features that were complete and tested to ensure they still operate as expected.

The opposite effect of the read myth is that retrieving the entire object graph for an account is so well optimized that it is better to load the entire object and use only the needed data elements rather than create a new read procedure. With an ORM, this is handled very well using projections and fetching strategies. Developers can use the ORM to read a partial object graph, returning only the required data elements and reducing the data movement between the database and application server.
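The idea behind a projection can be shown without any particular ORM. The Python sketch below uses the standard library `sqlite3` module and plain SQL as a stand-in, with a hypothetical `account` table; an ORM projection would generate a similarly narrow query rather than selecting every column of every related table.

```python
import sqlite3
from typing import Optional, Tuple


def create_demo_database() -> sqlite3.Connection:
    """Build an in-memory database with one hypothetical account row."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, name TEXT, billing_address TEXT, "
        "phone_number TEXT, notes TEXT)"
    )
    connection.execute(
        "INSERT INTO account (id, name, billing_address, phone_number, notes) "
        "VALUES (?, ?, ?, ?, ?)",
        (1, "Acme", "1 Main St", "555-0100", "a long account history"),
    )
    return connection


def read_billing_address(
    connection: sqlite3.Connection, account_id: int
) -> Optional[Tuple[int, str]]:
    """Fetch only the columns the billing-address screen needs."""
    cursor = connection.execute(
        "SELECT id, billing_address FROM account WHERE id = ?",
        (account_id,),
    )
    return cursor.fetchone()
```

Only two columns cross the wire here, no matter how wide the account row (or the graph hanging off it) grows.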

All of this accidental complexity was created based on the superstition that only a stored procedure would be fast enough to support the scalability needs of the product. An optimization that was implemented before a bottleneck was identified.

Considering that most ORMs today are capable of writing very efficient SQL and have dialects specifically tuned for each database platform, the read performance of the ORM is less likely to be a system bottleneck. For example, with Microsoft SQL Server, NHibernate takes advantage of batch queries with ADO.NET to reduce the number of round trips between the database and application servers. The SQL generated is also parameterized, allowing the SQL engine to cache execution plans for better server performance. Given these optimizations have already been done by the ORM, tuning read performance in the database is not likely to create the biggest benefit in system scalability. For example, caching of already loaded objects will likely result in greater overall read performance.
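The caching point can be illustrated with a tiny Python sketch of a cache that remembers objects it has already loaded, so repeated reads of the same account never reach the database. This is a hypothetical stand-in for the kind of cache an ORM session or a second-level cache provides; `LoadedObjectCache` and its `loader` callable are assumptions for the example.

```python
from typing import Callable, Dict, Generic, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class LoadedObjectCache(Generic[K, V]):
    """Return already loaded objects instead of reading them again."""

    def __init__(self, loader: Callable[[K], V]) -> None:
        self._loader = loader  # the expensive read, e.g. a database query
        self._loaded: Dict[K, V] = {}

    def get(self, key: K) -> V:
        if key not in self._loaded:
            # First request for this key: pay for the read once.
            self._loaded[key] = self._loader(key)
        return self._loaded[key]

    def evict(self, key: K) -> None:
        # Call after a write so the next read sees fresh data.
        self._loaded.pop(key, None)
```

A read that is never issued is cheaper than any tuned stored procedure, which is why caching tends to pay off before database-side tuning does.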

Did I forget to mention that this early decision tightly coupled the product to using a particular database platform? SQL dialects are hardly portable between platforms, so the product now has to decide if it will work with a single platform or create a separate release branch for each database platform supported. The better ORMs support multiple server dialects, including Microsoft SQL Server, Oracle, MySQL, PostgreSQL, and many others.

I said I wouldn’t argue the performance difference between using an ORM and a stored procedure. I will point out, however, that using a stored procedure to tune performance is an optimization for a particular environment and should not be an early choice in system design. Going straight for the stored procedure without considering less complex options is another case where the tool we used yesterday is not necessarily appropriate for a system being designed today.

To Be Continued…

Above I’ve covered a few of the design choices made early in the development of several major products and how they affected the evolution of each product over time as features were added. I also applied a modern view of how many of the choices we made before all these “great tools” were available are not necessarily bad today. As I get more time, I hope to share a few more stories with you as I uncover them in what has basically become a “career retrospective” for me.

 

MVP Award Renewed for 2010

Today I was honored for the second time with the Microsoft MVP award. It’s great to be recognized for my efforts in the .NET community over the past year. The next year is already shaping up to be another great one, with upcoming speaking engagements at Dallas TechFest, Devlink (Nashville, TN), St. Louis Day(s, plural) of .NET, and the Heartland Developers Conference in Omaha, NE.

If you are near any of these great events, I hope you are able to attend, learn a few things, and most importantly meet others that are part of the software development community. I also would encourage you to attend a few sessions outside of your regular development platform to get an idea of how other technologies solve the same problems in their own way. The cost to value of all these events is an absolute bargain, and many have early registration discounts that are only good for a limited time, so be sure to get registered to ensure the best price.

I look forward to meeting some of you over the next few months, so if you are at one of my talks or see me in the hall, be sure to introduce yourself and give a shout out.

(word)

Heartland Developers Conference 2008

This past week I had the pleasure of attending the Heartland Developers Conference in Omaha, Nebraska. I was already in town visiting family and decided to take a day to see what the local flavor had to offer. I’m particularly grateful to Joe Olsen of PhenomBlue for allowing me to register. I was originally hoping to secure a spot as a presenter, but the list apparently fills up in early March so next year I’ll try to plan for it.

On Wednesday night, a pre-party for attendees was held at the Qwest center. I rolled into the event around 9:00 PM (after watching the first part of the debate) and introduced myself to a few of the people there. Some of the folks I met included Jason Bock, Chris Williams, Joe Olsen, Amanda Laucher, Jeff Julian, and John Alexander. It was nice since there were some drink tickets being passed around and Rock Band 2 was going in the corner. I talked with several of the folks there and left looking forward to the content that was on tap for the coming day.

The morning of the event there was a lot of time to mill about before the sessions got underway (particularly since I got a ride from my mom and was there at 6:45 AM). Early on I found Joe Stagner (blog) and chatted about his recent hard drive upgrade amongst other things. I later met Clint Edmonson, an architect evangelist with Microsoft and discussed the content of the session he was delivering that day. There were a lot of good sessions at overlapping times so I wasn’t able to attend them all unfortunately. Once I settled down for the keynote next to Jason Bock, I caught up on some email and listened to Joe’s presentation.

My first session of the day was Jason Bock (blog, twitter) who spoke on Reflection and IL, including using Cecil for some post-build IL weaving. Very interesting topic for me and a great presentation. I spoke to Jason the night before to see if he had looked at any of the new expression tree tools in 3.5 and how to use them to build code on-the-fly without using Emit. It was an interesting discussion, Jason is a smart guy.

When it was time for lunch, a few of us went over to Farrell’s Bar and had some nachos and a burger. I found it funny that, as the only one from the south, I was also the only one who enjoyed the jalapeños. We chatted in general about community involvement and various events that we had all attended or presented at, and overall it was a great discussion.

After lunch, Drew Robbins (blog) presented on the Managed Extensibility Framework (MEF, web). Despite a stuffy head that made MEF sound like “METH”, I got a lot of great information and look forward to learning more about this new framework. The way you specify exports and imports really makes it easy to define an extensible application. I’m certainly going to look at ways to use MEF with MassTransit in order to provide a new way to compose services that consume messages.

I had to take a rest, so I spent some time chatting with the Geeks With Blogs guys and generally just taking it easy. After bouncing around I settled into Chris Williams’ (blog, wrap-up) talk on XNA. I’ve never looked at the game development tools for Windows, but I got a brief intro as to what to expect. Once this session was up, it was break time. I was planning on doing a podcast with the Geeks With Blogs guys but was trumped when Joe Stagner and Amanda Laucher sat down to do a joint session.

The day was great and I learned a few things. I also got to meet some great people and had some interesting conversation. I had hoped to attend the F# presentation by Amanda Laucher (blog) and the Open Source presentation by Javier Lozano (blog), but I wasn’t able to return for day two of the event. The time on the day I went was well spent and I look forward to attending HDC again in 2009.

Prelude to Tulsa TechFest 2008

In two weeks, the 2008 installment of Tulsa TechFest will be upon us. For two days, Tulsa is going to unleash an impressive array of sessions on all aspects of IT, security, and software development. As I review the broad list of presenters I can’t help but see conflicting sessions where I’m going to have to make some tough choices.

If you are going to be anywhere near the Tulsa area and can manage to slip away from work for a couple of days, I highly recommend making an appearance. The breadth of learning opportunities at the unbelievable price ($2/day) makes this an incredible way to learn some new skills and sharpen your existing ones.

I will be presenting in two sessions this year. The first session will be on building distributed applications using MassTransit (co-hosted by Dru Sellers) and other open-source frameworks for .NET. This is going to be an in-depth look at how to build loosely-coupled systems on top of a messaging service (in this case, MSMQ). Advanced topics include asynchronous messaging and sagas (long-lived transactions).

The second session will be an introduction to iPhone development. I’m not a seasoned expert here, but I’m impressed with the platform provided by Apple (Xcode) free of charge for building applications for Mac OS X and the iPhone. This introduction will cover the tools and application structure for building iPhone applications in Objective-C.

If you happen to see me there, feel free to stop me and say hello.

Assert.That(this, Is.Easy);

I came up with this a month or two ago, but finally decided to share it. While working on MassTransit, I was joking with Dru Sellers about how nice it was to have really good test coverage when making design changes to some all-new development code. I’ve had very limited opportunity to work on a completely new project started purely from unit tests, so I was just impressed at how easy it was to make code changes knowing that a passing set of tests meant all was well in the world.

You see, not all parking lots are paved with quality asphalt, generally flat, and void of any obstructions like islands and lights (see my other hobby). At work, our application is a lot of vintage C++ code, a ton of stored procedures packed to the hilt with domain logic, and nearly zero percent unit test coverage. Since we adopted agile development, unit testing is something that has been missing from our process. In our latest iteration, we’ve started using unit tests (with NUnit) to design our interfaces and classes. At the same time, we’re integrating MassTransit to support the loosely coupled layer of application services (which include object translation, communication with high-latency remote systems, and lazy auditing of transactions). Aside from a few basic web services to support remote client application support tools, this is the first C#/.NET development that is being done as part of the main application.

So, back to our story: my first project with really good test coverage exposed me to a lot of new things. I had read about TDD, used it to build some basic tests for various classes, and thought I had a pretty decent understanding of it. In this new project, I also learned how to use Rhino.Mocks (which took the test run time from 40-50 seconds down to 1.83 seconds on average), a very powerful tool for making an interface behave as you would expect an implementation of that interface to behave. The use of mocks has really helped me focus on actually writing tests and building a single class at a time. Prior to using mocks I would jump around creating additional classes as I defined new interfaces just to be able to continue writing my unit tests on the original class. By using a mock, I’m able to simulate the behavior of the other class without losing focus.
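To make the idea concrete, here is a minimal sketch of testing one class against a mocked collaborator. It is written in Python with the standard library's unittest.mock rather than the C# and Rhino.Mocks used on the project, and the names (TransactionProcessor, the auditor and its record method) are hypothetical, invented only for illustration.

```python
from unittest.mock import Mock


class TransactionProcessor:
    """Class under test; depends on an audit collaborator it does not own."""

    def __init__(self, auditor):
        self._auditor = auditor

    def process(self, amount):
        if amount <= 0:
            return False
        # Delegate to the collaborator, which may be slow or not yet written.
        self._auditor.record("processed", amount)
        return True


def test_process_records_an_audit_entry():
    auditor = Mock()  # stands in for the audit class (hypothetical)
    processor = TransactionProcessor(auditor)

    assert processor.process(25) is True
    auditor.record.assert_called_once_with("processed", 25)


def test_rejected_amount_is_not_audited():
    auditor = Mock()
    processor = TransactionProcessor(auditor)

    assert processor.process(0) is False
    auditor.record.assert_not_called()
```

The point of the sketch is that the auditor never has to be implemented for the processor's tests to be written and pass, which is what keeps the focus on one class at a time.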

As my appreciation for TDD grew, I jokingly dropped a slogan into a chat window (using Skype, of course, aren’t you?):

Assert-That-This-Shit.png

I got a few chuckles, and thought it would make a great t-shirt to wear to tech events like code camps. So I threw together a quick online store so that I could order one for myself. I showed it to a few others (like Joe Ocampo, who suggested the slightly less offensive, yet subtly more suggestive variant) and decided to make it available to anyone that wanted one. So if you like it, grab one for yourself and maybe I’ll see you wearing it at ALT.NET Seattle!

Iteration One is a Wrap

Over the past two weeks, my department has been working on our first iteration using agile practices. Yesterday, we wrapped up with a retrospective to go over our progress. We used a fish bowl to keep the conversation centered and focused — a method that once again proved to be useful for controlling a discussion without controlling the discussion.

We set up a whiteboard with columns for the following topics:

Start
Things that we should start doing on the next iteration.

Continue
Things that we should continue to do every iteration.

Stop
Things that we should stop doing.

Debt
Things that we did (or didn’t) do that will contribute to our technical debt.

We started with an introduction to the retrospective, a declaration of our goals, and a quick recap of how the fish bowl works. We also identified a remote advocate — a single person who is responsible for coordinating communication with our remote team members. Our company uses Live Meeting for conferencing, so we explained how to use the seating chart and how to use the Raise Hand feature. The advocate also had their IM client up for any out-of-band questions or issues with the meeting client. Once that was up and running, we opened the discussion.

Some of the topics discussed:

  • Start making sure the acceptance criteria are well defined before starting the story
  • Start pairing throughout the development of the engineering tasks and not just at the end for review
  • Start keeping an audit trail of initials of people who worked on a story or an engineering task
  • Stop putting incomplete stories into the backlog
  • Continue the daily stand-up meeting format

There were many more, but you can see how the structure worked. In all, we identified around 20 items that we need to either start, stop, or continue.

Once that segment of the meeting was over, we went over some of the methods being used to track things like burn down. Our project manager (whom we have yet to designate with a more appropriate agile title) went over the spreadsheet she uses to track story points, engineering task estimates, and actual hours worked on each task. She then showed some web sites from other groups in the company doing Scrum and how they had organized their Wiki, how they posted pictures of their planning board and burn down chart, and their honorary stuffed ScrumMaster.
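For anyone unfamiliar with burn down tracking, the arithmetic behind the chart is simple enough to sketch. The following Python snippet is only an illustration, not our project manager's spreadsheet: the function name burn_down is hypothetical, and the numbers are made up rather than taken from our iteration.

```python
def burn_down(total_points, completed_per_day):
    """Return the story points remaining at the end of each day."""
    remaining = total_points
    series = []
    for done in completed_per_day:
        remaining -= done
        series.append(remaining)
    return series


# Made-up numbers for a ten-day iteration, for illustration only.
remaining_by_day = burn_down(40, [0, 3, 5, 0, 8, 2, 5, 6, 4, 3])
print(remaining_by_day)  # [40, 37, 32, 32, 24, 22, 17, 11, 7, 4]
```

Plotting that series against the days of the iteration gives the burn down chart; a line that flattens out for several days is an early hint that stories are stuck.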

We did have a few bumps towards the end of the iteration with our test environment and the number of defects coming back from testing (which is why we want to start pairing earlier). We hope to improve with each iteration (of course) but for our first lap around the track I think we did pretty well!