
Decorator Pattern Example

A while ago, I worked on a problem that is frequently encountered in programming books and job interviews. The problem goes like this:

SALES TAXES

Basic sales tax is applicable at a rate of 10% on all goods, except books,
food, and medical products that are exempt. Import duty is an additional
sales tax applicable on all imported goods at a rate of 5%, with no
exemptions.

When I purchase items I receive a receipt which lists the name of all the
items and their price (including tax), finishing with the total cost of the
items, and the total amounts of sales taxes paid. The rounding rules for
sales tax are that for a tax rate of n%, a shelf price of p contains
(np/100 rounded up to the nearest 0.05) amount of sales tax.

Write an application that prints out the receipt details for these shopping
baskets…

INPUT:

Input 3:
1 imported bottle of perfume at 27.99
1 bottle of perfume at 18.99
1 packet of headache pills at 9.75
1 box of imported chocolates at 11.25

OUTPUT

Output 3:
1 imported bottle of perfume: 32.19
1 bottle of perfume: 20.89
1 packet of headache pills: 9.75
1 imported box of chocolates: 11.85
Sales Taxes: 6.70
Total: 74.68

The way the tax rate should be calculated for an item is determined at run time, based on the properties of the item. This seemed like a problem domain that is suitable for the decorator pattern. In this approach, I used a TaxCalculator that decorated the items at run time, so that the items calculated their prices dynamically based on their properties. This is a simple domain model sketch to explain the idea:

Centralising the logic in a calculator object that chooses the strategy used to calculate the tax based on an item’s properties would probably come to mind earlier than this approach, but I think this approach is less boring :)

Using a decorator provided more run-time flexibility (based on the items in the Order) and also made it easier to extend and add new functionality, either by changing the decorators’ behaviour or simply by adding more decorators. However, since the TaxCalculator uses the strategy pattern, it can later be replaced, after some refactoring, with a calculator that calculates the taxes itself.

The following class diagram describes how I used the decorator pattern to decorate Item objects:

You can find the source code at: http://github.com/erenay/sales-taxes-question
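To give a rough idea of the shape of this approach (the actual implementation is in the repository linked above; the class and method names below are only illustrative, not taken from that code), a decorator that adds a tax on top of an item's price might look something like this. Each decorator computes its tax on the original shelf price, rounds it up to the nearest 0.05 as the problem requires, and adds it to the price accumulated so far:

import java.math.BigDecimal;
import java.math.RoundingMode;

// Component interface: anything that has a price.
interface Priceable {
  BigDecimal shelfPrice(); // price before any taxes
  BigDecimal price();      // price including the taxes added so far
}

// A plain shelf item with its base price.
class Item implements Priceable {
  private final String name;
  private final BigDecimal shelfPrice;

  Item(String name, BigDecimal shelfPrice) {
    this.name = name;
    this.shelfPrice = shelfPrice;
  }

  public BigDecimal shelfPrice() { return shelfPrice; }
  public BigDecimal price() { return shelfPrice; }
}

// Base decorator: wraps another Priceable and adds one kind of tax to its price.
abstract class TaxDecorator implements Priceable {
  private final Priceable decorated;
  private final BigDecimal rate;

  TaxDecorator(Priceable decorated, BigDecimal rate) {
    this.decorated = decorated;
    this.rate = rate;
  }

  public BigDecimal shelfPrice() { return decorated.shelfPrice(); }

  public BigDecimal price() { return decorated.price().add(roundedTax()); }

  // Tax on the shelf price, rounded up to the nearest 0.05.
  private BigDecimal roundedTax() {
    BigDecimal raw = decorated.shelfPrice().multiply(rate);
    return raw.divide(new BigDecimal("0.05"), 0, RoundingMode.UP)
              .multiply(new BigDecimal("0.05"));
  }
}

class BasicSalesTax extends TaxDecorator {
  BasicSalesTax(Priceable decorated) { super(decorated, new BigDecimal("0.10")); }
}

class ImportDuty extends TaxDecorator {
  ImportDuty(Priceable decorated) { super(decorated, new BigDecimal("0.05")); }
}

Decorators are then stacked at run time according to the item's properties, for example:

// 1 imported bottle of perfume at 27.99 -> 32.19
Priceable perfume = new ImportDuty(new BasicSalesTax(
    new Item("imported bottle of perfume", new BigDecimal("27.99"))));
System.out.println(perfume.price()); // prints 32.19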

Running Redmine on Heroku

When Pivotal Tracker switched to a paid scheme, I decided to try Redmine on Heroku. It took more time than I expected, so I want to share some points that might help people who would like to try the same configuration.
This is far from a complete guide because I didn’t take notes during the whole process; I will do that next time ;)

The Heroku documentation describes the overall process well. Heroku recommends installing gems using Bundler, but this caused problems with Redmine 1.1.1, the version I used: the Heroku logs displayed “Missing the Rails 2.3.5 gem” and “H10 – App crashed” errors.

Instead, I defined my gems using the gem manifest file, .gems. The gems and versions that worked for me were:
builder (2.1.2), rails (2.3.5), rack (1.0.1), i18n (0.4.2)
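.gems is a plain-text manifest with one gem per line; assuming the manifest format Heroku supported at the time (a gem name followed by an optional --version flag), the file would look roughly like this:

builder --version 2.1.2
rails --version 2.3.5
rack --version 1.0.1
i18n --version 0.4.2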

Here is a discussion noting that Rack 1.1.0 is incompatible with Rails 2.3.5 and older, and here is another one suggesting version 1.0.1.

I removed redmine’s vendor/rails directory in order to allow Heroku to manage the gems.

I also added the following line to the config/environment.rb file:
config.action_controller.session = { :key => "_myapp_session", :secret => "longStringOfCharacters" }

I believe newer versions of Redmine will allow using Bundler and newer gem versions.

Bash Script for Partial Data Dump to MySQL

I was required to create a “light version” of our database, with most of the tables empty and some tables containing only their first x entries. Those entries were specified in a CSV file as around 5000 id values. This meant running mysqldump iteratively in a loop: when I passed all the id values to a single mysqldump command, it failed with an “Argument list too long” error.
Following advice from Dennis Williamson, I ended up with a script like this:

# dump the schema only (no data), stripping AUTO_INCREMENT values
dump_filename=x`date +"%Y%m%d"`.sql
mysqldump -u root -pmysql --no-data curocomp | sed 's/AUTO_INCREMENT=[0-9]*\b//' > "$dump_filename"

# read the id values and split them into an array using commas as the delimiter
ids=$(< ids.csv)
saveIFS=$IFS
IFS=','
array=($ids)
IFS=$saveIFS

count=0
for i in ${array[@]}; do array[$count]=$i","; ((count++)); done # add commas back to each element

num=1000 # number of elements to process at a time
max=2000
for ((i=0; i<$max; i+=$num))
do
  list=${array[@]:$i:$num}
  # the excess trailing comma is stripped off by ${list%,}
  mysqldump -u root -pmysql myDB myTable --skip-add-drop-table --where="Id in (${list%,})" >> "$dump_filename"
done

xDDs

These are some notes that I took from a presentation by Gojko Adzic at DDD eXchange 2010. The talk was about how TDD, DDD and BDD complement each other in many aspects, and how these concepts were applied together on previous projects he was involved in.

An obvious commonality between TDD and BDD is that they both focus on automating control over the software project in order to reduce the cost of finding where the problems are. Another commonality, between DDD and TDD, is that they both use experimentation as a means of iteratively improving the domain model or the units under test: DDD plays around with models to find the optimum, and unit tests play around with code that does not yet have an actual implementation. Additionally, DDD and BDD share the aspect of collaborative development with a focus on the business. DDD asks us to build the model collaboratively with the business, and BDD takes this further by demanding collaboration with stakeholders in defining requirements. In this way, business users can influence the development team and clarify the targets for them. Furthermore, with collaborative specifications, BDD can help the development team cut scope in order to meet requirements. Domain experts are not authorised to make decisions about scope.

DDD has the Core Domain concept, which is about defining the parts of the model that bring business value. However, if there is not enough detail for measuring the business value, BDD can bring more detail to the business aspects of the project with concepts such as Feature Injection [1] and value chains. With feature injection, we pull requirements into the project incrementally, and these requirements change the amount of work, and therefore the amount of time, that we need to take into account. Consequently, when alternative or reference models are being considered for the model to be implemented, we don’t need to consider very long-term specifications that are likely to change. We can focus on the time period determined by the requirements pulled in for that iteration.

Design and development techniques complementing each other. Image taken from the presentation by Gojko Adzic


The Ubiquitous Language concept of DDD ensures language consistency among team members; however, as the project evolves, the terms of the ubiquitous language might gain different meanings for different members of the team. Specification workshops can be used as a means of evolving the language, by using the terms of the ubiquitous language in the acceptance tests that are produced as a result of those workshops.

BDD can be very helpful for giving domain experts just enough detail. It provides a higher level of understanding that avoids exposing domain experts to technical details that are not necessary for them. Those details should be the concern of the development team, and they should be dealt with in unit tests.

Domain experts do not need to know the low-level details of a software project. Acceptance tests used in BDD make it easier for the business to understand.

Gojko argued that depending on Emergent Design causes inconsistency in large teams, as not everybody is aware of how other members see the system and which approaches they prefer for refactoring.
The DDD building block patterns (Entity, Value Object, Aggregate, Repository, etc.) come in handy in these situations, since they form a common language of good practices. This reduces the amount of inconsistency that might be introduced by emergent design.
Gojko said that he does not agree with the view of unit tests as something that keeps the system integrated during refactoring, because the unit tests we have are tightly coupled to a particular design: when the design changes, all unit tests related to that class have to be updated. BDD can help during refactoring by providing an invariant that specifies the functionality that should not change. I consider the fact that unit tests (and integration tests) break after classes are changed to be a help for developers in putting the system back into working order, by pointing to the locations where reviews and immediate updates are necessary; this would take much more time without unit and integration tests. The important point is that, during these reviews and updates, business specifications should be used as the reference point for things that should not be altered.

During emergent design, there are times when the design should be captured (i.e. documented in a dynamic way), so that a reference point for a common understanding of the system is kept and newcomers to the project can get up to speed easily. However, the code itself is too low level for this purpose, and UML diagrams get old very quickly and can be costly to maintain. If the model is reflected through acceptance tests and our tests are automated, we can use this as a Live Specification that explains the dynamics of the model.

During refactoring, it is common for some refactorings to make cross-cutting changes in the system, causing conflicts in the code base. One might wish to have multiple contexts and limit refactorings to only the relevant contexts, but avoiding cross-cutting changes completely is often an unrealistic preference. DDD brings the Bounded Context concept, which advocates forming boundaries for models and enabling communication among these contexts by making use of cooperation patterns, context mapping and change management protocols, so that the negative effects of refactoring caused by cross-cutting concerns can be minimised.

As a result of these thoughts and as a summary, the following recipe for success was suggested by Gojko:
* Use strategic design to decide what to build
* Use feature injection for adjusting the scope for DDD
* Evolve and maintain a ubiquitous language with specification workshops
* Start with working on higher level domain design guided by business specifications, and establish guidelines
* Then start working on technical details with TDD
* Use context mapping for managing cross-cutting concerns

[1] Elizabeth Keogh, ‘Pulling Power: A New Software Lifespan’, http://www.infoq.com/articles/pulling-power

Misbehaviour with BDD

A general misconception, which I also shared until Mauro Talevi’s presentation last week at Skills Matter, is the idea that Behaviour Driven Development (BDD) should only be used in completely agile environments. Mauro explained in his presentation that this is not necessarily true, and he shared his experiences of applying BDD at a global investment bank over the past year.

BDD brings some concepts from Domain Driven Design (DDD) into Test Driven Development (TDD), such as having a Ubiquitous Language for bridging the divide between business and IT. (It seems like the recent increase in “Driver candidates” for our design and development has caused people outside the industry to hate IT even more; I heard one person say “DBB or BBD, whatever…”, rolling her eyes, before the event.) One of the main advantages of BDD is that tests describe the behaviour. Additionally, acceptance tests can be executed. Mauro said that he sees BDD as synonymous with integration testing, because it allows you to see the system from above the boundaries and through the eyes of the stakeholder, by speaking the language of the business.

My first introduction to the concept of BDD was when I attended Chris Parsons’ (CEO of Eden Development) BDD workshop, held one day before the Rails Underground Conference ’09. In that workshop we experimented with Cucumber for applying BDD techniques. This time, I had a chance to see examples from a Java-based BDD framework named JBehave; Mauro Talevi is also an active contributor to the core development of JBehave.

The example used during the presentation was a stock price monitoring system. The following test can be used to specify the expected behaviour of when to alert a trader about stock prices:

Given a threshold of 15.0
When a stock is traded at 15.5
Then trader should be alerted
When a stock is traded at 5.0
Then trader should not be alerted

Here, we use a grammar like: Given -> context, When -> event, Then -> outcome, And -> (repeat previous structure)

JBehave maps BDD steps to methods. For example, for a scenario like the one above, the step methods would look like this:

// Given a threshold of x
@Given("a threshold of $threshold")
public void aThreshold(double threshold){
}

// When a stock is traded at y
@When("a stock is traded at $price")
public void aStockIsTraded(double price){
}

// Then a trader should be alerted
@Then("trader should $beOrNot alerted")
public void traderAlerted(String beOrNot){
}

JBehave allows you to have custom parameters by implementing the ParameterConverter Interface.

public class DateConverter implements ParameterConverter {
  public Object convertValue(String value, Type type) {
    // dateFormat (a DateFormat) is injected into DateConverter
    return dateFormat.parse(value);
  }
}

JBehave’s command-line interface supports both Maven and Ant, it works with any IDE that supports unit testing with JUnit, and it can provide a BDD layer on top of any web testing API. Mauro gave the following example for Selenium:

public class YourSteps extends SeleniumSteps {
  @Then("there are $some messages")
  public void someMessages(int some) {
    int count = selenium.getXpathCount("//foo/bar").intValue();
    assertEquals(some, count);
  }
  // more steps
}

We don’t have to show this piece of code to the stakeholder. The only sentence he or she needs to see is: “Then there are x messages”. This way, the functionality is communicated to the business much better.

Next, Mauro introduced the Scenario Web Runner, which provides a web interface for running generic scenarios. The fully functional example webapp that was shown during the presentation can be found here.

Case Study

The project that Mauro was involved in started in September 2008 at a global investment bank. It was a messaging and transaction management architecture. Initially the focus was on the back end, but later web front-end development was also added. They used Scrum. Communication went well because they used scenarios to communicate behaviour, and since the behaviour was visible, this allowed the team to gain confidence.

When it comes to disadvantages, the fact that formats like CSV and XML do not refactor very well caused some problems in maintaining large data sets. For example, it may not always be easy to determine whether an XML representation represents input or output data. Additionally, these formats are not very appropriate for showing data to stakeholders.

A chart that shows the increase in the number of scenarios per 2 week sprints. (Image taken from the presentation slides)

Mauro also stated that it is hard to determine the optimum way of doing verification. For example, capturing an expected output in a file may be more complete, but embedding it into the scenario text may be more readable and refactorable. Another decision might be whether to capture the whole object or only a field of that object. He also mentioned that it is hard to keep the development process behaviour-driven all the time, because the business side may not always be available, or there might be misunderstandings caused by a lack of sufficient business analysis. In these situations, the issues that were not covered, due to lack of business analysis or another reason, can be caught in the next sprint, after discussing them with the business side.

If there are too many repetitions of similar steps in a scenario, the scenario data should be refactored and reviewed so that it fits into a form more suitable for the business side. Besides repetition of scenario steps, this also applies to repetition of entire scenarios. Another lesson learned was that pairing among developers, testers and business analysts lowers the risk of misunderstanding.

This presentation showed that BDD can play a key role in building trust between development and business sides and producing software that matters to the business.


When to Use CouchDB

This article is based on George Palmer’s talk at the Rails Underground Conference last week. George is the developer of the couch_foo plugin, which allows interacting with CouchDB from Ruby in an ActiveRecord style.

A CouchDB database is a collection of documents that represent objects with simple named fields. Documents are stored as JSON, and subsets of documents are handled via views. Views are built dynamically and are used for aggregating and reporting on the documents in a database. Unlike schema-based SQL databases, CouchDB works on semi-structured, document-oriented data, which is the kind of data most collaborative web applications use. This structure allows adding new document types alongside the old ones. It has a peer-based distributed architecture, which can use multiple CouchDB hosts holding independent “replica copies” of the same database.

REST can be used as an interface to the stored documents. Using REST has some advantages, like load balancing and caching. When we “view” a subset of documents, these views are saved as “_design/…” documents. This document explains how to get the result set that we want to view.

A simple view function that shows the documents with a type of “van” is like this:

function(doc) {
  if (doc.Type == "van") {
    emit(doc.Name, doc);
  }
}

and this should return the documents with the “van” type in a JSON structure. The first parameter of the emit function should be the key value of the document, which in this case is the name property of the document. emit works by storing the key/value pairs in an array and then, when all views in the same _design document have been calculated, returning all the results at once.

Associations in one query were described with an example.

function(doc) {
  if (doc.type == "post") {
    emit([doc._id, 0], doc);
  } else if (doc.type == "comment") {
    emit([doc.post, 1], doc);
  }
}
If we think of a typical set of blog documents, where each post may be associated with many comments, this function emits all posts with 0 as the second element of the key and the document id as the first, and all comments with 1 as the second element of the key and the id of the post they belong to as the first. When we view this by entering “_view/post_comments/all” in the browser, we get the following view:

"key":["1",0],  "value":{"_id":"1",  "type":"post",
"text":"My Blog Post"}
"key":["2",0],  "value":{"_id":"2",  "type":"post",
"text":"My 2nd Blog Post"}
"key":l"3",0], "value":{"_id":"3", "type":"post",
"text":"My 3rd Blog Post"}
"key":["l",l],  "value":{"_id":"3",  "type":"comment",
"text":"You rock dude", "post":"1")
"key":["2",l],  "value":{"_id":"3",  "type":"comment",
"text":"Han you suck", "post":"2"}

Here, we can see all the documents in JSON format, and the associations of posts with their comments. If we want to see only post 1 and up to 2 comments associated with it, we add the following parameters: /all?startkey=["1"]&endkey=["1",2]

In relational databases, the cost is generally paid at the point of insertion. With CouchDB, the cost is paid when a view is read for the first time after new documents have been written. This means that if the application writes new entries to the database frequently, view reads might slow the application down.

CouchDB uses the _rev field in each document for conflict management. This field is used to determine the winning document when a conflict occurs.

George talked about FriendFeed as an example where using a schemaless database is a better solution. They moved to a schemaless structure on top of MySQL, storing only simple key/value pairs, because after a while building indexes became too time-consuming with the amount of data stored in the DB.

The second example was one of George’s own projects, called 5ft Shelf. He explained how his initial DB schema gradually became highly complex. The following picture shows the difference between the two approaches; the second schema is the CouchDB approach.

The two 5ft Shelf schemas: the relational approach and the CouchDB approach

CouchDB is not likely to be the best solution where fixed definitions are needed and the stored objects are unlikely to change, such as estate agency applications where house properties have very strict, unchanging attributes, or financial applications with very strict definitions.

The slides from the talk are here: http://www.slideshare.net/Georgio_1999/couch-foo-couchdb-on-rails

Sample chapters from the upcoming “CouchDB: The Definitive Guide” book by O’Reilly: http://books.couchdb.org/relax/

George’s blog is: http://www.rowtheboat.com/

Video of the talk can be seen here: http://skillsmatter.com/podcast/ajax-ria/george-palmer-spending-more-time-on-the-couch

DDD eXchange 2009

DDD eXchange was last Friday. The event started with Eric Evans’ keynote on strategic design and responsibility traps. Strategic design is covered by roughly the last 40% of Evans’ book, and it was a good opportunity to hear an overview of his thoughts on the subject. Most of us know from experience that not all of any large system will be well designed. However, Evans argued that trying to make all parts of a large system equally well designed may lead the whole system to become badly designed.

He gave an example of a company that wanted to replace its legacy system with a better-designed one using more recent technology. In most scenarios like this, which tend to fail, the switch is planned to be accomplished in around three phases of around a year each: building the infrastructure, transferring the legacy code onto this structure, and adding the new features on top of those layers in the final year. However, transferring the legacy code generally does not turn out to be as easy as it was thought to be. The complexities in the legacy code are there for a reason, and redesigning them needs much more focus on the domain design. It is also possible that by the time the project reaches phase 3, people will have forgotten what the main focus of the project was; furthermore, the business will continue evolving and some of the requirements will need to change.

Bad goal: Switching off the complex legacy system and building a nice, well organised system from scratch (Image taken from Evans' presentation)

Bad Goals

The first bad goal is completely switching off the legacy system. Evans stated that, unless the current system is extremely expensive to keep, re-designing it is a bad idea, because things will never be as perfect as an imaginary nice pyramid. The agile kind of solution that might then come to mind is refactoring, but this leads to thinking with the concepts of the old system while building the new one.

Hacking is another option that might come to mind: only adding the new features instead of changing the whole design. As a result, this adds more complexity to the system. It is also the approach that is generally followed after the other approaches mentioned above fail. In the end, all of these methods are likely to result in a final system similar to the one that was supposed to be replaced. According to Evans, all these approaches underestimate the importance of the design.

The Core Domain

There are Generic Subdomains, which do not require you to innovate (e.g. accounting); Supporting Subdomains, which are specific to your business; and the Core Domain, which cannot be bought off the shelf. It is important to understand the difference between the core domain and core features. For legacy systems with highly complex connections among their components, hacking seems like the best approach for adding core features, because it leads to working on the core immediately.

Responsibility Traps

Evans describes irresponsible programmers as the ones who cause downstream cost and harm that goes unrecognised because they deliver the required features fast. It is a common mistake to build development teams around a few irresponsible developers, with other developers who are supposed to clean up the mess. As a result of falling into these traps, there are no sexy new capabilities, and the hackers get recognised while the design people are not regarded enough. Furthermore, leveraging weak programmers with sophisticated frameworks or platforms is an illusion; those tools should be used to leverage your own team, who know the system.

Context Mapping

Is it possible to just focus on the core domain? There are always multiple models. ‘Context mapping’ addresses the fact that different groups model differently. In the same way that different people touching different parts of an elephant form different concepts of the elephant, different designers can have different models of different parts of a system, and this is okay, because all those concepts can be bound together in a meaningful way to form a domain model. Evans suggests that we should not focus on building complete pictures of a system.

Using an Anti-corruption layer for focusing on the core domain. (Image taken from Evans' presentation)

Another bad goal is deciding to use the same model in every team for every project. Using a ubiquitous language within a bounded context is the solution proposed by DDD, and this is where Context Mapping comes into practice. We should have a single, unified model within each context. Some of the contexts can be “big balls of mud”. Anti-corruption layers are used to separate these contexts from the outside world; an anti-corruption layer is an interface designed to support our model.

Anti-corruption layers can be used to focus only on the core domain by providing the required functionality from the legacy system. The Practitioner Reports on the DDD website include experiences of applying DDD, along with a report about implementing an anti-corruption layer.

Evans listed the good goals as: stabilising the legacy system, clarifying the context map, delivering early, producing enthusiastic business sponsors, focusing on the core domain and creating a platform.

Thick translation layers, inelegant (but stable) legacy and supporting domains are ugly, and should be avoided.

By taking this advice into account, you can be the hero in your project for a change. More information about strategic design can be found in Part 4 of Evans’ DDD book.

Context Mapping in Action

The next talk of the day was by Alberto Brandolini, about Context Mapping. He explained the way he practises his profession and his experiences of applying DDD in his previous projects. He also talked about various “mentors” who inspired the way he approaches problems, like Mr Wolf, Sun Tzu and Franco Begbie! He described each project as a limited-resource game, where the resources, apart from money, are brain cells, time, developers and skills.

In the first scenario he described, there was a freshly written but badly designed legacy core. Furthermore, the analysis team was separated from development, there were different development teams in different cities, and interaction with users was forbidden. They started by producing an anti-corruption layer for the core domain and working on its features; however, after some time, because of the lack of effective communication between teams, the contexts started to bleed into each other and it was hard to share a common vision. In this particular situation, exposing the truth about the architecture allowed the organisation’s management to understand the problems and take the essential steps to prevent further problems.

The second, “Strangelove” scenario was a large government project with a great development team, where many actors were involved and the domain was not the primary focus; it was about connecting existing components. Like the previous project, they did not have good conditions for applying DDD. They had to work with a team from a different company that was responsible for security, and the boundaries between the two teams were not well defined. Soon the relationship between the two teams turned into fighting. As a solution, they defined an anti-corruption layer for their project, then started doing continuous integration while publishing their steps reliably. They focused on the other team’s needs and tried to form an effective partnership. Two years later, the system was working perfectly but it was not doing the right thing, because the team had not focused on the domain enough.

OpenHack 2009, London

OpenHack London was a great opportunity for me to get introduced to a powerful set of web development tools from Yahoo! and to be inspired by the ideas that some talented hackers produced over a 24-hour period. Before the hacking started, Yahoo! engineers gave presentations about the tools and technologies, including BOSS, Fire Eagle, GeoPlanet, Pipes, YAP and YOS. All of the talks can be seen on Skills Matter’s web site [1].

The Kizoom team won the “Most Awesome Hack” prize with their robot finger that presses a button when it hears a loud noise. It is based on leJOS, which allows programming LEGO® robots in Java with a built-in Java virtual machine. Open Free Cycle, developed by Premasagar Rose and Tom Leitch using OpenMail, YQL and some other technologies, won the “Hacker’s Choice” award, selected by the audience. It is an attempt to enable community sharing for free. A list of all the hacks presented at the event can be found here.

[1] Yahoo! Application and Social Platforms

Martin Barnes – Yahoo! GeoPlanet. Exploring Places without Maps

Christian Heilmann – Remixing Web Data for Your Hacks

Ted Drake – Unlocking the Secrets of BOSS

Dav Glass – YUI3

Mike McKenna – PHP and Internationalisation

Also take a look at the following links:

A Very Personal Ramble Down Hackday Memory Lane

Sad Robot by Pornophonique, who gave a concert during the event
