Wednesday, May 27, 2020

Domain Driven Design Chapter 5 Summary

Chapter 5: A Model Expressed in Software
    • "Connecting model and implementation has to be done at the detail level."
  • Associations
    • "For every traversable association in the model, there is a mechanism in the software with the same properties."
    • "In real life, there are lots of many-to-many associations, and a great number are naturally bidirectional. The same tends to be true of early forms of a model as we brainstorm and explore the domain. But these general associations complicate implementation and maintenance. Furthermore, they communicate very little about the nature of the relationship."
    • Three ways of making associations more tractable
      • Imposing a traversal direction
        • "The United States has had many presidents, as have many other countries. This is a bidirectional, one-to-many relationship. Yet we seldom would start out with the name "George Washington" and ask, "Of which country was he president?" Pragmatically, we can reduce the relationship to a unidirectional association, traversable from country to president. This refinement actually reflects insight into the domain, as well as making a more practical design. It captures the understanding that one direction of the association is much more meaningful and important than the other. It keeps the "Person" class independent of the far less fundamental concept of "President.""
      • Adding a qualifier, effectively reducing multiplicity
        • "Very often, deeper understanding leads to a "qualified" relationship. Looking deeper into presidents, we realize that (except in civil wars, perhaps) a country has only one president at a time. This qualifier reduces the multiplicity to one-to-one, and explicitly embeds an important rule into the model. Who was the president of the United States in 1790? George Washington."
      • Eliminating nonessential associations
        • "Of course, the ultimate simplification is to eliminate an association altogether, if it is not essential to the job at hand or the fundamental meaning of the model objects."
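
The first two refinements above (a traversal direction and a qualifier) can be sketched in a few lines of Python. This is only an illustration, not code from the book: the class names, the `president_in` method, and the year-granularity storage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    """Knows nothing about presidencies: the association is one-way."""
    name: str


@dataclass
class Country:
    name: str
    # Hypothetical storage: (start_year, end_year, person), end year exclusive.
    _terms: List[Tuple[int, int, Person]] = field(default_factory=list, repr=False)

    def add_presidency(self, start_year: int, end_year: int, person: Person) -> None:
        self._terms.append((start_year, end_year, person))

    def president_in(self, year: int) -> Optional[Person]:
        """Qualified association: the year narrows 'many presidents' to one."""
        for start_year, end_year, person in self._terms:
            if start_year <= year < end_year:
                return person
        return None


usa = Country("United States")
usa.add_presidency(1789, 1797, Person("George Washington"))
usa.add_presidency(1797, 1801, Person("John Adams"))
print(usa.president_in(1790).name)  # George Washington
```

Note that `Person` has no reference back to `Country`, so the question "of which country was he president?" simply cannot be asked of the model, which is the point of imposing a direction.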
  • Entities (A.K.A. Reference Objects)
    • "Many objects are not fundamentally defined by their attributes, but rather by a thread of continuity and identity."
    • "many things are defined by their identity, and not by any attribute. In our typical conception, a person [...] has an identity that stretches from birth to death and even beyond. That person's physical attributes transform and ultimately disappear. The name may change. Financial relationships come and go. There is not a single attribute of a person that cannot change; yet the identity persists."
    • "Some objects are not defined primarily by their attributes. They represent a thread of identity that runs through time and often across distinct representations. Sometimes such an object must be matched with another object even though attributes differ. An object must be distinguished from other objects even though they might have the same attributes. Mistaken identity can lead to data corruption."
    • Modeling Entities
      • "[T]he most basic responsibility of entities is to establish continuity so that behavior can be clear and predictable. They do this best if they are kept spare. Rather than focusing on the attributes or even the behavior, strip the entity object's definition down to the most intrinsic characteristics, particularly those that identify it or are commonly used to find or match it. Add only behavior that is essential to the concept and attributes that are required by that behavior. Beyond that, look to remove behavior and attributes into other objects associated with the core entity. [...] Beyond identity issues, entities tend to fulfill their responsibilities by coordinating the operations of objects they own."
    • Designing the Identity Operation
      • "Each entity must have an operational way of establishing its identity with another object -- distinguishable even from another object with the same descriptive attributes. An identifying attribute must be guaranteed to be unique within the system however that system is defined -- even if distributed, even when objects are archived."
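
As a rough sketch of what an identity operation can look like, the Python below compares entities by an identifier only and ignores every descriptive attribute. The `Customer` class and the choice of a random UUID as the identifying attribute are assumptions for illustration; the book does not prescribe a particular ID scheme.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: we define identity-based equality ourselves
class Customer:
    """Entity: equality and hashing use only the identifier."""
    name: str
    customer_id: uuid.UUID = field(default_factory=uuid.uuid4)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Customer):
            return NotImplemented
        return self.customer_id == other.customer_id

    def __hash__(self) -> int:
        return hash(self.customer_id)


a = Customer("Pat Smith")
b = Customer("Pat Smith")                                   # same attributes, new identity
renamed = Customer("Pat Jones", customer_id=a.customer_id)  # same identity, new name

print(a == b)        # False
print(a == renamed)  # True
```

Two customers with the same name stay distinct, while a renamed customer is still recognized as the same entity.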
  • Value Objects
    • "Tracking the identity of entities is essential, but attaching identity to other objects can hurt system performance, add analytical work, and muddle the model by making all objects look the same.

      "Software design is a constant battle with complexity. We must make distinctions so that special handling is applied only where necessary.

      "However, if we think of this category of object as just the absence of identity, we haven't added much to our toolbox or vocabulary. In fact, these objects have characteristics of their own and their own significance to the model. These are the objects that describe things."
    • "When you care only about the attributes of an element of the model, classify it as a value object. Make it express the meaning of the attributes it conveys and give it related functionality. Treat the value objects as immutable. Don't give it any identity and avoid the design complexities necessary to maintain entities."
    • Designing Value Objects
      • "We don't care which instance we have of a value object. This lack of constraints gives us design freedom we can use to simplify the design or optimize performance. This involves making choices about copying, sharing, and immutability."
      • To safely share or copy value objects, they must be immutable.
      • You can choose either to share or to copy, depending on what constraints you are trying to optimize for.
    • Designing Associations that Involve Value Objects
      • Bidirectional associations between two value objects make no sense.
      • Without identity, it is meaningless to say that an object points back to the same value object that points to it.
      • "Try to completely eliminate bidirectional associations between value objects. If in the end such associations seem necessary in your model, rethink the decision to declare the object a value object in the first place. Maybe it has an identity that hasn't been explicitly recognized yet."
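
A minimal Python sketch of a value object, assuming a hypothetical `Money` type: it is immutable, compared purely by its attributes, and its operations return new instances rather than changing existing ones, which is what makes sharing and copying safe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes instances immutable
class Money:
    """Value object: no identity, equality is attribute by attribute."""
    amount_cents: int
    currency: str

    def add(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            raise ValueError("cannot add different currencies")
        # Return a new value instead of mutating either operand.
        return Money(self.amount_cents + other.amount_cents, self.currency)


price = Money(500, "USD")
total = price.add(Money(250, "USD"))

print(total == Money(750, "USD"))  # True
print(price)                       # Money(amount_cents=500, currency='USD')
```

Because any `Money(750, "USD")` is interchangeable with any other, the caller never needs to care which instance it holds.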
  • Services
    • "Some concepts from the domain aren't natural to model as objects. Forcing the required domain functionality to be the responsibility of an entity or value either distorts the definition of a model-based object or adds meaningless artificial objects."
    • A service is an operation offered as an interface that stands alone in the model, without encapsulating state, as entities and value objects do.
    • A good service has three characteristics:
      • The operation relates to a domain concept that is not a natural part of an entity or value object.
      • The interface is defined in terms of other elements of the domain model.
      • The operation is stateless.
    • Services and the Isolated Domain Layer
      • Services are used in layers other than just the domain layer.
      • "It takes care to distinguish services that belong to the domain layer from those of other layers, and to factor responsibilities to keep that distinction sharp."
    • Granularity
      • The service pattern can be used as a means of controlling granularity in the interfaces of the domain layer.
      • "Medium-grained, stateless services can be easier to reuse in large systems because they encapsulate significant functionality behind a simple interface."
    • Access to Services
      • "Distributed system architectures [...] provide special publishing mechanisms for services, with conventions for their use, and they add distribution and access capabilities. [These] architectures should be used only when there is a real need to distribute the system or otherwise draw on the framework's capabilities."
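
The three characteristics of a good service listed above can be illustrated with a small funds-transfer sketch in Python. The `Account` class and the `transfer_funds` function are hypothetical names chosen for this example: the operation does not belong naturally to either account, its signature is expressed in terms of domain objects, and it keeps no state between calls.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Minimal entity stand-in, identified by account_id."""
    account_id: str
    balance_cents: int


def transfer_funds(source: Account, destination: Account, amount_cents: int) -> None:
    """Stateless domain service: all inputs arrive as arguments, nothing is kept."""
    if amount_cents <= 0:
        raise ValueError("transfer amount must be positive")
    if source.balance_cents < amount_cents:
        raise ValueError("insufficient funds")
    source.balance_cents -= amount_cents
    destination.balance_cents += amount_cents


checking = Account("chk-1", 10_000)
savings = Account("sav-1", 2_500)
transfer_funds(checking, savings, 4_000)
print(checking.balance_cents, savings.balance_cents)  # 6000 6500
```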
  • Modules (A.K.A. Packages)
    • "Everyone uses modules, but few treat them as a full-fledged part of the model. Code gets broken down into all sorts of categories, from aspects of the technical architecture to developers' work assignments. Even developers who refactor a lot tend to content themselves with modules conceived early in the project.

      "It is a truism that there should be low coupling between modules and high cohesion within them. [...] There is a limit to how many things a person can think about at once (hence low coupling). Incoherent fragments of ideas are as hard to understand as an undifferentiated soup of ideas (hence high cohesion)."
    • "Choose modules that tell the story of the system and contain a cohesive set of concepts."
    • "Give modules names that become part of the ubiquitous language. Modules and their names should reflect insight into the domain."
    • Agile Modules
      • "Modules need to coevolve with the rest of the model. This means refactoring modules right along with the model and code."
      • This refactoring often doesn't happen because there are many difficulties in refactoring modules.
      • "Whatever development technology the implementation will be based on, we need to look for ways of minimizing the work of refactoring modules, and minimizing clutter in communicating to other developers."
    • The Pitfalls of Infrastructure-Driven Packaging
      • Many frameworks encourage package structures that reflect the infrastructure.
      • This can lead to a business object being split across multiple packages, causing low cohesion.
      • This can also lead to packages that have a high number of calls between them, causing high coupling.
      • "Use packaging to separate the domain layer from other code. Otherwise, leave as much freedom as possible to the domain developers to package the domain objects in ways that support their model and design choices."
  • Modeling Paradigms
    • Why the Object Paradigm Predominates
      • Object modeling strikes a nice balance of simplicity and sophistication.
      • It also has significant circumstantial advantages deriving from maturity and widespread adoption.
      • It has a mature developer community.
    • Nonobjects in an Object World
      • "Whatever the dominant model paradigm may be on a project, there are bound to be parts of the domain that would be much easier to express in some other paradigm."
      • "When there are just a few anomalous elements of a domain that otherwise works well in a paradigm, developers can live with a few awkward objects in an otherwise consistent model."
      • "But when major parts of the domain seem to belong to different paradigms, it is intellectually appealing to model each part in a paradigm that fits."
      • "[M]aking a coherent model that spans paradigms is hard, and making the supporting tools coexist is complicated."
    • Sticking with Model-Driven Design When Mixing Paradigms
      • "Without a seamless environment, it falls on the developers to distill a model made up of clear, fundamental concepts to hold the whole design together."
      • "The most effective tool for holding the parts together is a robust ubiquitous language that underlies the whole heterogeneous model."

Tuesday, May 26, 2020

Domain Driven Design Chapter 4 Summary

Part II: The Building Blocks of a Model-Driven Design
  • "Developing a good domain model is an art. But the practical design and implementation of a model's individual elements can be relatively systematic. Isolating the domain design from the mass of other concerns in the software system will greatly clarify the design's connection to the model. Defining model elements according to certain distinctions sharpens their meanings. Following proven patterns for individual elements helps produce a model that is practical to implement."
  • "Elaborate models can cut through complexity only if care is taken with the fundamentals, resulting in detailed elements that the team can confidently combine."
Chapter 4: Isolating the Domain
    • "The part of the software that specifically solves problems from the domain usually constitutes only a small portion of the entire software system, although its importance is disproportionate to its size. [...] We must not be forced to pick them out of a much larger mix of objects [...] We need to decouple the domain objects from other functions of the system, so we can avoid confusing the domain concepts with other concepts related only to software technology or losing sight of the domain altogether in the mass of the system."
  • Layered Architecture
    • "In an object-oriented program, UI, database, and other support code often gets written directly into the business objects. Additional business logic is embedded in the behavior of UI widgets and database scripts. This happens because it is the easiest way to make things work, in the short run.

      "When the domain-related code is diffused through such a large amount of other code, it becomes extremely difficult to see and to reason about. Superficial changes to the UI can actually change business logic. To change a business rule may require meticulous tracing of UI code, database code, or other program elements. [...] [A] program must be kept very simple or it becomes impossible to understand."
    • Most successful architectures use some version of these four conceptual layers:
      • User Interface (or Presentation Layer)
        • Responsible for showing information to the user and interpreting the user's commands.
      • Application Layer
        • Defines the jobs the software is supposed to do and directs the expressive domain objects to work out problems.
      • Domain Layer (or Model Layer)
        • Responsible for representing concepts of the business, information about the business situation, and business rules.
      • Infrastructure Layer
        • Provides generic technical capabilities that support the higher layers.
    • "Partition a complex program into layers. Develop a design within each layer that is cohesive and that depends only on the layers below. Follow standard architectural patterns to provide loose coupling to the layers above. Concentrate all the code related to the domain model in one layer and isolate it from the user interface, application, and infrastructure code. The domain objects, free of the responsibility of displaying themselves, storing themselves, managing application tasks, and so forth, can be focused on expressing the domain model. This allows a model to evolve to be rich enough and clear enough to capture essential business knowledge and put it to work."
    • Relating the Layers
      • Layers should be loosely coupled, with dependencies only in one direction.
      • Upper layers can use or manipulate elements of lower ones by calling public interfaces.
      • When a lower level needs to communicate upward (beyond answering a direct query), patterns such as callbacks or observers should be used.
    • Architectural Frameworks
      • "When infrastructure is provided in the form of services called on through interfaces, it is fairly intuitive how the layering works and how to keep the layers loosely coupled."
      • Some technical problems call for more intrusive forms of architecture.
      • Architectural frameworks often require the other layers to be implemented in very particular ways.
      • "The best architectural frameworks solve complex technical problems while allowing the domain developer to concentrate on expressing a model. But frameworks can easily get in the way, either by making too many assumptions that constrain domain design choices or by making the implementation so heavyweight that development slows down."
      • "A lot of the downside of frameworks can be avoided by applying them selectively to solve difficult problems without looking for a one-size-fits-all solution."
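
The point under "Relating the Layers" about lower layers communicating upward through callbacks or observers can be sketched as follows. This is an assumed, simplified example: `Order` stands in for a domain-layer object and `show_shipped_banner` for user-interface code. The domain object only knows that it holds some callables; it never imports or names anything from the layer above it.

```python
from typing import Callable, List


class Order:
    """Domain layer: has no knowledge of who is listening."""

    def __init__(self, order_id: str) -> None:
        self.order_id = order_id
        self.shipped = False
        self._listeners: List[Callable[["Order"], None]] = []

    def on_shipped(self, listener: Callable[["Order"], None]) -> None:
        self._listeners.append(listener)

    def ship(self) -> None:
        self.shipped = True
        for listener in self._listeners:
            listener(self)  # upward communication via callback


# User interface layer: depends on the domain layer, not the other way around.
messages: List[str] = []


def show_shipped_banner(order: Order) -> None:
    messages.append(f"Order {order.order_id} has shipped")


order = Order("A-100")
order.on_shipped(show_shipped_banner)
order.ship()
print(messages)  # ['Order A-100 has shipped']
```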
  • The Domain Layer is Where the Model Lives
    • "The "domain layer" is the manifestation of [the domain model] and all directly related design elements. The design and implementation of business logic constitute the domain layer. In a model-driven design, the software constructs of the domain layer mirror the model concepts."
  • The Smart UI "Anti-Pattern"
    • "Many software projects do take and should continue to take a much less sophisticated design approach that I call the smart UI. But smart UI is an alternate, mutually exclusive fork in the road, incompatible with the approach of domain-driven design. If that road is taken, most of what is in this book is not applicable."
    • If a project needs to deliver simple functionality, dominated by data entry and display, with few business rules, and the staff is not composed of advanced object modelers, it may warrant using the smart UI pattern.
  • Other Kinds of Isolation
    • "Unfortunately, there are influences other than infrastructure and user interfaces that can corrupt your delicate domain model."
    • Chapters 14 and 15 will deal with a number of these issues.

Monday, May 11, 2020

Renaming Tables and Columns in PostgreSQL with a Zero-Downtime Pipeline

As any programmer will tell you, one of the hardest things in programming is naming things. There are many reasons for this, and we don't need to get into the weeds with it, but this leads to a fairly common scenario where something is given a poor name, and later on it needs to be refactored to be given a better name. (Or possibly it had a great name to begin with, but as the code evolved, the name became outdated. This leads to the same scenario. It needs to be renamed.)

But when it comes to databases, programmers frequently find that renaming things is hard. In addition to writing the migration to rename the table or column, you'll also need to scour all the queries in your code and make sure that you've changed all references. And if you missed something, you won't know about it until you or someone else happens to hit that query during runtime. (If things were done right, you'll have an automated test suite that you can run that will instantly tell you whether or not you missed something, but in the real world, less than ideal situations are common, and you very well might not have this luxury.)

And with zero-downtime deploys, things become even trickier. Many programmers assume that all changes in a commit go out simultaneously, but this is not actually the case. The process of deploying, which involves running migrations, bringing down old servers, starting up new servers, etc., takes time, and these parts can't happen all at once. As such, a zero-downtime deploy pipeline will often look something like this:
  1. The migration is run against the database
  2. The old application containers (of which there are many instances) are brought down one at a time and replaced with the new application containers in turn.
This results in a period of time where there are both old application servers and new application servers running. This time is short, but it is still there. And any user that is on an old application server after the migration is run will get errors because they are trying to reference the old table or column name.

Some programmers, knowing about these issues, will wait until off hours to run the deploy. In best case scenarios, this has you up late at night, outside of normal work hours, running a deploy. In worst case scenarios, where you have a global audience, this isn't even an option.

As a result of all these issues, I've frequently run into situations where programmers simply forgo renaming things in the database, which results in most of the code having the cleaner, clearer refactored name, but once you reach the database level, you're looking at the old ugly name.

To be quite frank, there had to be a better way to deal with this. That sent me researching, and I discovered that in Postgres, views are updatable if they are kept simple enough. An article on this can be found here.

So with this being the case, we can use views to create aliases of sorts, and I'll give examples for table names and column names below.

But before I get to the examples, I'd like to address a concern that I've heard regarding views. The concern is that through using views the database will be drastically slowed down because the views do not have access to the indexes on the table. This is simply not true. Views in Postgres are implemented through rules, and the query planner is more than capable of taking a query referencing the view and the query embedded inside the view and combining and optimizing them. Now, if you have some rather complex views, or views referencing views, or other such things, then the query planner might not be intelligent enough to optimize them in the same way that a hand-crafted query would be, but that won't be a concern for the very simple views that we use below.

Renaming a Table


Let's walk through an example of renaming a table. In the query below I create a table with the name "old_table_name" which I'll plan on refactoring to the name "new_table_name". I also insert some basic data into this table.
create table old_table_name (
    id serial primary key,
    some_value text not null
);

insert into old_table_name (some_value) values
('Value 1'),
('Value 2'),
('Value 3'),
('Value 4'),
('Value 5'),
('Value 6'),
('Value 7'),
('Value 8'),
('Value 9'),
('Value 10');
Now, as a way to verify that the migrations we'll write won't cause problems, let's insert a large amount of data into this table. By running the following query 19 times we'll end up with over 5 million rows in the table.
insert into old_table_name (some_value)
select some_value
from old_table_name;

select count(*)
from old_table_name;
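
As a quick check of the "over 5 million" figure: each run of the insert-select above copies every existing row, so the row count doubles each time. Starting from the 10 seed rows, the arithmetic can be sketched in Python (this only models the count, it does not touch the database):

```python
rows = 10                # the ten seed rows inserted earlier
for _ in range(19):      # each run of the insert-select doubles the table
    rows += rows

print(rows)  # 5242880
```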
Additionally, here are some CRUD operation queries that we can use to represent the kind of queries that we'll find in the code. We'll want to make sure that these don't break during the migration. Note that I've intentionally designed these four queries so that if they are run in order, they will return the table to its initial state. This gives us a way to repeatedly test our CRUD operations.
select id, some_value
from old_table_name
where id = 1;

update old_table_name
set some_value = 'New Value'
where id = 1
returning id, some_value;

delete from old_table_name
where id = 1;

insert into old_table_name (id, some_value) values
(1, 'Value 1')
returning id, some_value;
Now, for our first migration. We'll add a view with the "new_table_name" that we're wanting to use and have it reference the table using the "old_table_name". This should put us in a situation where queries using the "old_table_name" and queries using the "new_table_name" will both work. After running this migration, we can rerun the above CRUD operation queries and verify that they still work. Note that this migration ran in 50 milliseconds (times will of course vary).
create view new_table_name as
select id, some_value
from old_table_name;
This then allows us to update the queries in the code at our leisure, whether that's part of the deploy with the above migration, or in a subsequent deploy, or even many subsequent deploys. This puts us in a nice situation where not all renames are immediately required. We'll discuss the benefits of this later. In any case, over time you'll want to update all of your queries. We can take the above four CRUD operation queries and update them to use the "new_table_name" and verify that the queries work:
select id, some_value
from new_table_name
where id = 1;

update new_table_name
set some_value = 'New Value'
where id = 1
returning id, some_value;

delete from new_table_name
where id = 1;

insert into new_table_name (id, some_value) values
(1, 'Value 1')
returning id, some_value;
Once you're certain that you've updated all queries referencing the table to use the "new_table_name", then you can run the final migration to drop the view and update the table name. Note that you'll want to run these two queries inside of a transaction. Be sure that you understand how your migration tool works and how it handles transactions, because it may not mean explicitly running "begin;" and "commit;" as I'm showing below. In my tests dropping the view took 45 milliseconds, and renaming the table took 45 milliseconds, so even though this migration will lock up the table with an access exclusive lock, everything will still work fine, because the amount of time is inconsequential.
begin;
drop view new_table_name;

alter table old_table_name rename to new_table_name;
commit;
After running the migration, the queries referencing the "new_table_name" work, and any queries referencing the "old_table_name" will not.

Renaming a Column


This will have a number of similarities to table renaming, along with some key differences. So to set up our example, we'll create a table with a column named "old_column_name" that we intend to update to "new_column_name", and we'll add some data.
create table table_name (
    id serial primary key,
    old_column_name text not null
);

insert into table_name (old_column_name) values
('Value 1'),
('Value 2'),
('Value 3'),
('Value 4'),
('Value 5'),
('Value 6'),
('Value 7'),
('Value 8'),
('Value 9'),
('Value 10');
Once again, just to verify that our migrations won't cause issues with large amounts of data, we'll run the following query 19 times to give ourselves over 5 million rows.
insert into table_name (old_column_name)
select old_column_name
from table_name;

select count(*)
from table_name;
And here are our CRUD operation queries referencing the "old_column_name", which will be representative of queries in our application code:
select id, old_column_name
from table_name
where id = 1;

update table_name
set old_column_name = 'New Value'
where id = 1
returning id, old_column_name;

delete from table_name
where old_column_name = 'New Value';

insert into table_name (id, old_column_name) values
(1, 'Value 1')
returning id, old_column_name;
At this point we can run a migration that gives the table a temporary name and creates a view with the original table name. The view provides columns with both the "old_column_name" and the "new_column_name", where the "new_column_name" is just an alias to the "old_column_name". Note that these two queries should be run inside of a transaction. In my tests, the table rename ran in 45 milliseconds, and the create view ran in 47 milliseconds. After the migration runs, you can verify that the above CRUD operation queries still work.
begin;
alter table table_name rename to temp_table_name;

create view table_name as
select id, old_column_name, old_column_name as new_column_name
from temp_table_name;
commit;
At this point we can change the queries above to reference the "new_column_name" and verify that they still work.
select id, new_column_name
from table_name
where id = 1;

update table_name
set new_column_name = 'New Value'
where id = 1
returning id, new_column_name;

delete from table_name
where new_column_name = 'New Value';

insert into table_name (id, new_column_name) values
(1, 'Value 1')
returning id, new_column_name;
Once we are certain that all queries have been updated to reference the "new_column_name", we can then run a migration to drop the view, rename the table back to its original name, and rename the column to the "new_column_name". Once again, these queries should be run inside of a transaction. In my tests, dropping the view took 45 milliseconds, altering the table name took 46 milliseconds, and renaming the column took 45 milliseconds.
begin;
drop view table_name;

alter table temp_table_name rename to table_name;

alter table table_name rename old_column_name to new_column_name;
commit;
After this migration is run, then only the queries referencing the "new_column_name" will work, and the ones referencing the "old_column_name" will no longer work.

So with the how-to out of the way, let's discuss a few benefits. The first benefit is that by breaking things into at least two separate deploys, the first with the first migration and the changes to the queries, and then a follow-up with the final migration, we are able to rename a table or column and have a zero-downtime deploy go off without a hitch. During that short period of time where there are still old services making database calls in the old format as well as new services using the new format, everything will work without issue.

But what's perhaps even more beneficial here is the fact that you actually don't need to update all the queries at once. You could have one deploy that only runs the first migration that adds the view, then after that you could have one or many deploys updating the queries in the code, and these updates can be spread over time. Then once you know that all necessary queries have been updated, you could run the final migration removing the view and making the final changes to the table. There are multiple benefits that can come from this.

The first of these benefits is small batch size. There are many sources out there that discuss the benefits of small batch size, which include fewer bugs, code that is easier to review and deploy, and the ability to be more responsive to changing priorities. And this gives us the ability to do small batches. Instead of being forced to update every query in the code at once, I can instead make smaller changes, where perhaps a deploy simply changes one or two queries. It gives the developer more control and say over how to handle things.

Another benefit that can come from this is that it can help with those scenarios where you're not sure if you managed to find and update all of the references. There is nothing that says that that final migration needs to be run immediately. It could instead make sense to have a period of time where you wait and you monitor logs to see if there are any queries coming through that are referencing the old names (this is assuming that you are logging the queries that are run against your database). Once you've had a period of time where no references have been made to the old name, then you can run the final migration with some level of confidence that nothing is going to break.
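
As a rough illustration of that monitoring step, here is a small Python sketch that scans logged statements for the old name. The function name and the sample log lines are hypothetical, and the real format of your logs will depend on how query logging is configured. The match is case-insensitive because unquoted identifiers in Postgres are folded to lowercase, and it uses word boundaries so that a longer name that merely starts with the old one is not reported.

```python
import re
from typing import Iterable, List


def find_old_name_references(log_lines: Iterable[str], old_name: str) -> List[str]:
    """Return the log lines that mention old_name as a whole identifier."""
    pattern = re.compile(rf"\b{re.escape(old_name)}\b", re.IGNORECASE)
    return [line for line in log_lines if pattern.search(line)]


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "LOG:  statement: select id, some_value from new_table_name where id = 1;",
    "LOG:  statement: UPDATE old_table_name SET some_value = 'x' WHERE id = 2;",
    "LOG:  statement: select count(*) from old_table_name_archive;",
]

hits = find_old_name_references(sample_log, "old_table_name")
print(len(hits))  # 1
```

An empty result over a long enough window is the signal that it is reasonably safe to run the final migration.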

And so with that, we have a strategy for renaming things at the database level that is fairly straightforward and simple to implement that gives us the flexibility that we need to do the job with a level of confidence that we won't be breaking things in the process.

Saturday, May 9, 2020

Domain Driven Design Chapter 3 Summary

Chapter 3: Binding Model and Implementation

    • "The first thing I saw [...] was a complete class diagram [...] that covered a large wall. [...] As large as the wall size diagram was, the model did capture some knowledge. After a moderate amount of study, I learned quite a bit (though that learning was hard to direct [...]). I was more troubled to find that my study gave no insight into the application's code and design. [...] Because the model was "correct", the result of extensive collaboration between technical analysts and business experts, the developers reached the conclusion that conceptually based objects could not be the foundation of their design. So they proceeded to develop an ad hoc design."
    • "The project had a domain model, but what good is a model on paper unless it directly aids the development of running software? [...] Domain-driven design calls for a model that doesn't just aid early analysis but is the very foundation of the design."
  • Model-Driven Design
    • "Tightly relating the code to an underlying model gives the code meaning and makes the model relevant."
    • An analysis model, a model meant for understanding only and where mixing in implementation concerns is considered bad practice, fails to accomplish its goals:
      • It is not created with design issues in mind, and is impractical for those needs.
      • While some knowledge crunching happens, it is lost when coding begins.
      • It will go into depth about some irrelevant subjects, while it overlooks some important subjects.
      • Discoveries always emerge during the design/implementation effort.
    • "Model-Driven Design discards the dichotomy of analysis model and design to search out a single model that serves both purposes."
    • To make the model relevant:
      • Design a portion of the software system to reflect the domain model in a very literal way, so that mapping is obvious.
      • Revisit the model and modify it to be implemented more naturally in software, even as you seek to make it reflect deeper insight into the domain.
      • Demand a single model that serves both purposes well, in addition to supporting a robust Ubiquitous Language.
      • Draw from the model the terminology used in the design and the basic assignment of responsibilities.
      • The code becomes an expression of the model, so a change to the code may be a change to the model. Its effect must ripple through the rest of the project's activities accordingly.
      • To tie the implementation slavishly to a model usually requires software development tools and languages that support a modeling paradigm, such as object-oriented programming.
  • Modeling Paradigms and Tool Support
    • "Object-oriented programming is powerful because it is based on a modeling paradigm, and it provides implementations of the model constructs. [...] Although many developers benefit from just applying the technical capabilities of objects to organize program code, the real breakthrough of object design comes when code expresses concepts of a model."
    • Example: From Procedural to Model Driven
      • The example discusses a PCB layout tool that tries to find optimal paths for PCB nets. The software does not support the concept of buses, which are groupings of nets that follow the same path, essentially connecting multiple pins between two components. Procedural code has been written to process the layout tool's data files and use a naming convention to define buses, but what this code can do is limited and messy because of its procedural nature. By using an object-oriented paradigm, much more powerful and flexible concepts are able to emerge.
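To make the PCB example a little more concrete, here is a minimal sketch of what modeling buses explicitly might look like. The class names, the `assign_rule` method, and the rule string are illustrative assumptions, not the book's code or the layout tool's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    """A single electrical connection on the board."""
    name: str
    rules: list = field(default_factory=list)

    def assign_rule(self, rule):
        self.rules.append(rule)


@dataclass
class Bus:
    """A group of nets that should follow the same path."""
    name: str
    nets: list = field(default_factory=list)

    def add_net(self, net):
        self.nets.append(net)

    def assign_rule(self, rule):
        # A rule given to the bus is applied to every net in it.
        for net in self.nets:
            net.assign_rule(rule)


# Build a three-net bus and give the whole bus one (hypothetical) layout rule.
bus = Bus("data")
for i in range(3):
    bus.add_net(Net(f"data{i}"))
bus.assign_rule("min_line_width=4")
```

With the bus as an explicit object, "apply this rule to the bus" is a single call rather than a script that infers bus membership from a naming convention.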
  • Letting the Bones Show: Why Models Matter to Users
    • "In theory, perhaps, you could present a user with any view of a system, regardless of what lies beneath. But in practice, a mismatch causes confusion at best -- bugs at worst."
    • Microsoft Internet Explorer Favorites Example:
      • "A user of Internet Explorer thinks of "Favorites" as a list of names of Web sites that persist from session to session. But the implementation treats a Favorite as a file containing a URL, and whose filename is put in the Favorites list. That's a problem if the Web page title contains characters that are illegal in Windows filenames. Suppose a user tries to store a Favorite and types the following name for it: "Laziness: The Secret to Happiness". An error message will say: "A filename cannot contain any of the following characters: \/:*?"<>|". What filename? On the other hand, if the Web page title already contains an illegal character, Internet Explorer will just quietly strip it out. The loss of data may be benign in this case, but not what the user would have expected. Quietly changing data is completely unacceptable in most applications."
      • Either expose the fact that Favorites are just a collection of shortcut files, and let users leverage what they know about the file system to their benefit, or store the Favorites in a different way, so that they can be subject to their own rules, presumably the naming rules that apply to Web pages. Either option would provide a single model that tells the user everything that he needs to know.
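The second option, storing Favorites under their own rules, could look something like the sketch below. It is only an illustration: the `Favorite` and `Favorites` classes, the "any non-empty name" rule, and the URL are assumptions, and nothing here reflects how Internet Explorer was actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Favorite:
    """A named link; the name is free text, not a filename."""
    name: str
    url: str


class Favorites:
    """Keeps favorites under their own naming rule: any non-empty name."""

    def __init__(self):
        self._items = {}

    def add(self, name, url):
        if not name.strip():
            raise ValueError("a favorite needs a name")
        # The name is stored exactly as typed; no filename rules apply.
        self._items[name] = Favorite(name, url)

    def names(self):
        return list(self._items)


favorites = Favorites()
favorites.add("Laziness: The Secret to Happiness", "https://example.com/laziness")
```

Because the name never becomes a filename, the colon is kept as typed and the user never sees an error about a file they did not know existed.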
  • Hands-On Modelers
    • "Manufacturing is a popular metaphor for software development. One inference from this metaphor: highly skilled engineers design; less skilled laborers assemble the products. This metaphor has messed up a lot of projects for one simple reason -- software development is all design."
    • "If the people who write the code do not feel responsible for the model, or don't understand how to make the model work for an application, then the model has nothing to do with the software. If developers don't realize that changing code changes the model, then their refactoring will weaken the model rather than strengthen it. Meanwhile, when a modeler is separated from the implementation process, he or she never acquires, or quickly loses, a feel for the constraints of implementation. The basic constraint of Model-Driven Design -- that the model supports an effective implementation and abstracts key domain knowledge -- is half-gone, and the resulting models will be impractical. Finally, the knowledge and skills of experienced designers won't be transferred to other developers if the division of labor prevents the kind of collaboration that conveys the subtleties of coding a Model-Driven Design."
    • "Any technical person contributing to the model must spend some time touching the code, whatever primary role he or she plays on the project. Anyone responsible for changing code must learn to express a model through the code. Every developer must be involved in some level of discussion about the model and have contact with domain experts. Those who contribute in different ways must consciously engage those who touch the code in a dynamic exchange of model ideas through the Ubiquitous Language."

Friday, May 8, 2020

Domain Driven Design Chapter 2 Summary

Chapter 2: Communication and the Use of Language

    • "A domain model can be the core of a common language for a software project."
    • "The model is a set of concepts built up in the heads of people on the project, with terms and relationships that reflect domain insight."
    • "To make most effective use of a model, it needs to pervade every medium of communication."
  • Ubiquitous Language
    • Domain experts:
      • have limited understanding of the technical jargon of software development.
      • use the jargon of their field.
    • Developers:
      • may understand and discuss the system in descriptive, functional terms, devoid of the meaning carried by the experts' language.
      • may create abstractions that support their design, but are not understood by the domain experts.
    • This causes a linguistic divide, where domain experts vaguely describe what they want, and developers vaguely understand.
    • With conscious effort, the domain model can provide the backbone of a common language.
    • Ubiquitous language:
      • includes names of classes and prominent operations.
      • includes terms to discuss rules that have been made explicit in the model.
      • is supplemented with terms from high-level organizing principles imposed on the model.
      • is enriched with the names of patterns the team commonly applies to the domain model.
      • meanings of words and phrases echo the semantics of the model.
    • "The more pervasively the language is used, the more smoothly understanding will flow."
    • Points to apply:
      • Use the model as the backbone of a language.
      • Commit the team to exercise that language in all communication, including:
        • Code
        • Diagrams
        • Writing
        • Speech (especially)
      • Iron out difficulties by experimenting with alternative expressions, which reflect alternative models.
        • Then refactor the code to conform to the new model.
      • Resolve confusion over terms in conversation.
      • Recognize that a change to the ubiquitous language is a change to the model.
      • Domain experts should object to terms or structures that are awkward or inadequate to convey domain understanding.
      • Developers should watch for ambiguity or inconsistency that will trip up design.
    • Example: Working Out a Cargo Router
      • Gives two versions of a conversation between a developer and a domain expert: one where the developer primarily uses technical software terms, and one where the code reflects a model and a shared language, demonstrating how much more concise the conversation becomes.
  • Modeling Out Loud
    • "One of the best ways to refine a model is to explore with speech, trying out loud various constructs from possible model variations. Rough edges are easy to hear."
    • As an addendum to the ubiquitous language pattern:
      • Play with the model as you talk about the system.
      • Describe scenarios out loud using the elements and interactions of the model, combining concepts in ways allowed by the model.
      • Find easier ways to say what you need to say, and then take those ideas back down to the diagrams and the code.
  • One Team, One Language
    • "Technical people often feel the need to "shield" the business experts from the domain model. [...] Of course there are technical components of the design that may not concern the domain experts, but the core of the model had better interest them. Too abstract? Then how do you know the abstractions are sound? Do you understand the domain as deeply as they do? [...] [A] domain expert is assumed to be capable of thinking somewhat deeply about his or her field. If sophisticated domain experts don't understand the model, there is something wrong with the model."
    • "The domain experts can use the language of the model in writing use cases, and can work even more directly with the model by specifying acceptance tests."
    • "Multiplicity of languages is often necessary, but the linguistic division should never be between the domain experts and the developers."
  • Documents and Diagrams
    • "Simple, informal UML diagrams can anchor a discussion. Sketch a diagram of three to five objects central to the issue at hand, and everyone can stay focused."
    • "The trouble comes when people feel compelled to convey the whole model or design through UML. A lot of object model diagrams are too complete and, simultaneously, leave too much out."
      • People feel it needs to show all the detail they will code.
        • With all that detail, no one can see the forest for the trees.
      • Yet in spite of that detail, important information is still missing.
        • Behavior and constraints are not so easily illustrated.
        • This falls to supplemental text or conversation.
    • "Diagrams are a means of communication and explanation, and they facilitate brainstorming. They serve these ends best if they are minimal."
    • "The vital detail about the design is captured in the code."
    • Written Design Documents
      • "[M]aking written documents that actually help the team produce good software is a challenge."
      • "Once a document takes on a persistent form, it often loses its connection with the flow of the project. It is left behind by the evolution of the code, or by the evolution of the language of the project."
      • Two general guidelines for evaluating a document:
        • Documents Should Complement Code and Speech
          • "A document shouldn't try to do what the code already does well. The code already supplies the detail. It is an exact specification of program behavior."
          • "Other documents need to illuminate meaning, to give insight into large-scale structures, and to focus attention on core elements." 
          • "Documents can clarify design intent when the programming language does not support a straightforward implementation of a concept."
        • Documents Should Work for a Living and Stay Current
          • "A document must be involved in project activities."
          • If a document is not read, is not found compelling, or is being left behind, then it is not relevant or not important enough to update.
          • "It could be safely archived as history, but left active it could create confusion and hurt the project. And if a document isn't playing an important role, keeping it up to date through sheer will and discipline wastes effort."
    • Executable Bedrock
      • Well-written code can be very communicative, but ensuring that it communicates the correct message takes effort.
      • The behavior of the code is indisputable, but that does not mean that what the written code says reflects this behavior. Misnamed or unclear class, function, and variable names can pollute the meaning.
      • "It takes fastidiousness to write code that doesn't just do the right thing but also says the right thing."
      • "To communicate effectively, the code must be based on the same language used to write the requirements -- the same language that the developers speak with each other and with domain experts."
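A small sketch of the difference between code that only does the right thing and code that also says it. The cargo-booking vocabulary (overbooking allowance, booked tonnage, vessel capacity) and the 1.1 factor are assumptions chosen for illustration; they do not come from the summary above.

```python
# Does the right thing, but the names say nothing about the domain.
def check(a, b):
    return a <= b * 1.1


# Same behavior, expressed in a (hypothetical) shared language.
OVERBOOKING_ALLOWANCE = 1.1


def is_within_overbooking_limit(booked_tonnage, vessel_capacity):
    """True if the booking stays within capacity plus the allowed overbooking."""
    return booked_tonnage <= vessel_capacity * OVERBOOKING_ALLOWANCE
```

Both functions return the same results, but only the second can be read aloud to a domain expert and checked against the rule it is meant to enforce.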
  • Explanatory Models
    • "The model that drives the design is one view of the domain, but it may aid learning to have other views, used only as educational tools, to communicate general knowledge of the domain."
    • "One particular reason that other models are needed is scope. The technical model that drives the software development process must be strictly pared down to the necessary minimum to fulfill its functions. An explanatory model can include aspects of the domain that provide context that clarifies the more narrowly scoped model."
    • Example: Shipping Operations and Routes
      • The example starts by showing a UML diagram of part of the model. While accurate, its meaning isn't readily apparent.
      • Next, a more free-form diagram following a timeline is given, which conveys much more clearly some of the ideas originally shown in the UML diagram.
      • Together they are easier to understand than either view alone.