7. Different styles

Which architectures are used in data integration

Raymond Meester
16 min read · Apr 8, 2024

Aleksandra is happy with all the demand. She is also happy that she is now completely independent and no longer has to rely on anyone. In fact, she already has a number of employees. Some of them also come from Ukraine; others are from the region. One has even opened a bakery in Rotterdam that is supplied from Amsterdam.

It’s time for the next step. Aleksandra’s Bakery will be renamed The Bakery Group. The new logo:

Now making a logo is fun, but there’s a lot more to it. By now there are quite a few systems in place. She uses Salesforce for resource planning and finance, she has a customer relationship management system, a website and a cloud application for recipes. These systems need to grow with the size of the bakery. And all those systems also need to integrate well with each other and with the suppliers’ systems.

Slowly, a successful bakery with a professional way of working is emerging.

The bakery

The input is still “ingredients” and the output is “cakes,” only now not for one neighbor, but for thousands of “neighbors”… Before the cakes roll off the assembly line, however, a bakery must first be in place. Premises had to be rented and a loan taken out to buy machinery.

Aleksandra has engaged an architect. With him she talks about the building, the layout of the bakery, the machines that produce the cakes, a docking platform for trucks and the office. She would also like a relaxation area and a kitchenette with healthy food. It can’t be a celebration every day!

A big dream, a big investment. Both Aleksandra and the architect she hired want the best possible result. But they must not lose sight of the budget and ultimate efficiency. At the end of the day, there still has to be something to earn.

The architect

The role of an architect is to bring together the desires of the client, builders and materials. The Order of Architects states:

In practice, the role of the architect is not limited to drawing up plans, necessary for applying for the permit, and checking the works carried out. It also includes practical guidance throughout the construction process based on the specifically agreed architectural brief.

Big data integration projects have the same requirements to be successful. For this, requirements, design and architecture are very important. After all, you can buy integration software components, but this does not guarantee that they will meet the organization’s requirements and needs. If these things are not in order, rather the opposite…

Requirements

An integration architect is responsible for the integration layer. This layer consists of the integration platform (the factory) on the one hand, and the integrations themselves (the products that roll off the assembly line) on the other. For the latter, solution architects are sometimes brought in, who focus on achieving a specific solution. The integration architect also works closely with the enterprise architect, who in turn looks organization-wide. In smaller organizations, this may be one and the same person.

The basic materials available to an integration architect are the software components (the machines), such as a broker or an API Management Platform. At this level we still talk about tools and frameworks. Once these tools and frameworks are installed and aligned we talk about an integration platform; the factory is in place.

The very first step in setting up an integration platform is to identify the requirements and needs of the organization, also known as establishing business requirements. Sometimes the help of a third-party consultant who is independent of vendors and technologies is sought here.

The consultant

A lot of applications are already in use within The Bakery Group. An HR and administration system. And, of course, a production, transport and warehouse system. Most are already connected to each other through a number of point-to-point integrations.

IT management is expressing a desire for a true integration platform. Meanwhile, the management team is already tossing around various abbreviations from Chapter 4. But truly realizing an integration layer proves difficult. It is decided to hire an external integration consultant to advise the company on the implementation.

Integration platform checklist

The very first thing the consultant does when he walks into The Bakery Group is to get the business goals clear and identify the corresponding requirements and wishes for the integration layer. A non-exhaustive list of questions he asks is:

Business processes

  • Is there a list of business processes?
  • Is there a list of (desired) integrations?
  • How do the integrations relate to business processes?

Data streams

  • What information flows between systems now?
  • Is there an architecture overview?
  • What are the architecture guidelines?
  • What architectural styles are in use within IT?
  • What middleware software is being used now?

Per integration

  • What applications are the source?
  • What applications are the target?
  • What protocols are used?
  • What data formats (XML, JSON) are used?
  • What transformations are there (Data mapping, Canonical data model)?

Technical knowledge

  • What knowledge about integration is available?
  • What knowledge is desired?
  • What are the roles and responsibilities of each IT staff member involved?

Robustness

  • How should this data be secured or authorized?
  • How sensitive is the system to outages, for example?
  • Has monitoring/alerting/management been considered?
  • Has any thought been given to Tracking & Tracing of data messages?
  • Has Continuous Delivery (Release/Deployment) been considered?

Performance

  • What is the number of messages now and in the future?
  • What hardware/Cloud/OS is being used?
  • What software applications are used?
  • What are the specifications of the system?
  • What are the performance requirements (real time/near real time)?

Architecture Roadmap

Based on the questions from the list, a consultant first zooms in on the integrations and middleware solutions currently in use. Then he goes on to specifically identify the requirements and wishes. Then he zooms out again. From this picture, the desired situation is outlined. Finally, together with the organization, he determines the route from the old to the new situation. The result is the architecture roadmap.

As already indicated, when creating the roadmap, there are a lot of questions to consider. However, not every question needs to be worked out in detail. An integration layer, consisting of a platform and integrations, lends itself well to building up step by step. You don’t have to get to level 10 right away, but you can unlock a new level each time as an organization. This is also called “integration maturity.”

Points that can increase integration maturity include: improving scalability, that is, the amount of data the platform can handle; providing a mix of technical and functional components that together make up a complete platform; establishing guidelines and standards for setting up these components; and making it ever easier for the people who build the integrations.

Below is an example of such a model:

Each time improvements are realized on the platform, the focus can be shifted to the integrations themselves. When realizing integrations, it is helpful to keep the following in mind:

  • The basic architectural principles
  • The data is central, not the applications
  • Communication is the focus, not technology

Design

The basis for The Bakery Group’s products is the recipe. Different recipes can produce all kinds of different cakes based on customer requirements. In data integration, there is no fixed recipe. However, it usually starts from a design. Based on analysis, architectural principles and best practices, this design is created.

A good design follows a set of fixed elements, so that the design is at the same time the documentation. In other words, a repeat recipe. A typical data integration design includes:

  1. Message: functional description of the message, including a sample message
  2. Producer: description of the producers and protocols they use; connectivity of one or more source applications
  3. Process: the process steps for processing the message; additionally, the software modules/containers that perform the processing
  4. Consumer: description of the target applications (consumers) and the protocols they use

Some designers use mostly text or a spreadsheet in the design, while others use mostly diagrams. Usually it’s a mix of both.
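The four design elements above can also be captured as structured data, which keeps design and documentation in one place. The following is only a minimal sketch in Python; the class names, fields and the example order integration (webshop, production system, protocols) are assumptions made for illustration, not part of any standard.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """A producer or consumer: an application plus the protocol it speaks."""
    application: str
    protocol: str


@dataclass
class IntegrationDesign:
    """One integration, described by the four fixed design elements."""
    message: str                # functional description of the message
    sample: str                 # a sample message
    producers: list[Endpoint]   # source applications
    process: list[str]          # ordered processing steps
    consumers: list[Endpoint]   # target applications

    def summary(self) -> str:
        """Render the design as a single readable line."""
        sources = ", ".join(f"{p.application} ({p.protocol})" for p in self.producers)
        targets = ", ".join(f"{c.application} ({c.protocol})" for c in self.consumers)
        steps = " -> ".join(self.process)
        return f"{self.message}: {sources} -> [{steps}] -> {targets}"


# Hypothetical example: cake orders flowing from the webshop to production.
order_design = IntegrationDesign(
    message="Cake order",
    sample='{"orderId": 1, "cake": "apple", "quantity": 2}',
    producers=[Endpoint("Webshop", "HTTPS")],
    process=["validate", "map to production format"],
    consumers=[Endpoint("Production system", "JMS")],
)
print(order_design.summary())
```

Because every design has the same four parts, a list of such objects can double as an inventory of all integrations on the platform.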

Architectural styles

“The world gets more and more connected” is a phrase you hear a lot these days. The questions “how is this world connected?” and “how to deal with more and more connections?” are architectural questions. Several styles have been developed in response to these questions. Before we get into these architectural styles, let’s go back in time.

1700-year-old monolith (Bolivia)

Monolithic

Data transmission occurred early in the development of computers. As early as the early 1960s, scientists were exploring ways for computers to communicate with each other. The most prominent project was ARPANET, from which the Internet with email (1971) and FTP (1973) evolved. However, those developments still focused primarily on networks. Other developments included the emergence of operating systems (Unix, DOS, OS400), databases (Oracle) and programming languages (COBOL, C).

Even before these developments took place, all kinds of large computers had emerged, called mainframes. These were often based on a monolithic architecture, which means that all functionality is implemented in one whole. There was often no other way to do this. No standard applications or containers were available yet, so companies built everything into one large mainframe application. Data was often entered via a command line user interface (terminal) or via an import/export data interface.

Monolithic architecture

In the 1980s and 1990s, standard applications emerged. Think of administrative or warehouse software. This software no longer necessarily had to run on a mainframe, but could run on a Unix or Windows server and use the Internet to exchange data.

Thus, companies were purchasing more and more external applications. Not only more applications, but also multiple protocols and programming languages surfaced. Using database links, scripting and adapters, most applications were linked directly to each other. These point-to-point links formed a bridge between applications.

In many cases, these applications were almost as monolithic as the mainframe systems they succeeded.

EAI

With more and more applications, an application landscape emerged. For organizations, it is important that the various applications in the landscape work as one. Herein you see an important step towards data integration as a separate IT layer.

Just as there are many means of transportation (boats, planes, and trucks) and infrastructure (roads, rail lines and airports) to solve logistic issues, data integration concepts and solutions emerged. Data was now connected not by direct bridges, but by transportation and infrastructure components.

EAI (Enterprise Application Integration) is a concept where point-to-point links are replaced by middleware. First of all, this limits the number of possible links:

If integration is applied without following a structured EAI approach, point-to-point connections grow across an organization. Dependencies are added on an impromptu basis, resulting in a complex structure that is difficult to maintain. This is commonly referred to as spaghetti, an allusion to the programming equivalent of spaghetti code. For example:

The number of connections needed to have fully meshed point-to-point connections, with n points, is given by $\binom{n}{2} = \frac{n(n-1)}{2}$. Thus, for ten applications to be fully integrated point-to-point, $10 \times 9 / 2 = 45$ point-to-point connections are needed.
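To get a feel for how quickly this grows, the formula can be compared with a middleware setup. The sketch below assumes the simplest possible hub model, in which each application needs exactly one link to the middleware; real landscapes are messier, so treat the second function as an illustration rather than a rule.

```python
def point_to_point_links(n: int) -> int:
    """Links needed to connect every pair of n applications directly."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Links needed if each of n applications connects only to the middleware.

    Assumption: one link per application, which is a simplification.
    """
    return n


for n in (3, 10, 50):
    print(f"{n} applications: {point_to_point_links(n)} direct links, "
          f"{hub_links(n)} links via middleware")
```

With three applications there is no difference yet, but at fifty applications the fully meshed landscape needs 1225 links against 50 via the hub.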

https://www.brainspire.com/services/internal-software-integration

EAI ensures that each application is provided with the correct data. Thus, the data of each application are consistent with each other. There is no need to build a separate direct interface between applications each time. All data flows run through middleware. Middleware makes it clear to everyone how the data flows. Still, implementation is difficult because you need central project management and coordination of all data integration. Also, this style leads to circulating many copies of data.

SOA

In the architecture style SOA, service-oriented architecture, as the name suggests, everything revolves around services. After the year 2000, you can see an ever-increasing breakdown of applications into such services. The reason behind this is simple: the mainframe and later the large standard ERP applications are difficult to maintain and change.

Services are software modules that are, as much as possible, independent and separate from other software. Because they are usually available over networks, such as the Internet, they are usually referred to as Web services. Each Web service has a definition of how to invoke the data interface. Unlike EAI, it is not about synchronizing data, but more about retrieving and processing it.

Setting up an SOA Service with the SOAP protocol

Using smaller modules allows IT and the organization to evolve more easily. Specific functionality should be easy to change without many dependencies on other software. This is usually known as “loosely coupled”: the software components depend on each other as little as possible. Of course, all services do communicate with each other. Each service should have a standardized way of interfacing. The interfaces should be uncluttered, reusable and scalable.

Often the explanation of SOA is rather technical. Therefore, let’s start from our analogy with the pastry shop:

Aleksandra employs several pastry chefs. In SOA, you assume that those pastry chefs are not collegial. Aleksandra therefore lets them communicate only when it is highly necessary and gives each pastry chef his or her own role. One baker makes the dough, one makes the batter, one makes the cakes and the last one does the decorating.

Aleksandra tells the dough maker not only to make dough, but also exactly where to put it. The same goes for the maker of the batter and so on. The cake maker takes the dough and batter as input for the cake and puts the made cakes somewhere. The final baker gets the whole cake and decorates it before putting it in the display case.

So the different bakers are mostly doing their own thing. They depend on each other only to the extent that there is a fixed moment of transition. In this example, you can see that each pastry chef has his own task with a clear procedure indicating where he gets the input and where to put the output.

A service in SOA works the same way. It is a self-contained module with a clear inbound and outbound interface. The advantage of the concept is that you can reuse a service. Just like you can reuse the dough for different types of pastry or cake. You just need to know where the dough is.
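The bakers from the analogy can be sketched in code. This is a deliberately simplified illustration: in a real SOA the services would be reached over a network (for example with SOAP), whereas here plain Python functions stand in for them. All names and types are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dough:
    kind: str


@dataclass(frozen=True)
class Batter:
    flavour: str


@dataclass(frozen=True)
class Cake:
    dough: Dough
    batter: Batter
    decorated: bool = False


# Each "service" has one task and a clear inbound and outbound interface.
def dough_service(kind: str) -> Dough:
    return Dough(kind)


def batter_service(flavour: str) -> Batter:
    return Batter(flavour)


def cake_service(dough: Dough, batter: Batter) -> Cake:
    return Cake(dough, batter)


def decoration_service(cake: Cake) -> Cake:
    # Returns a new, decorated cake; the input is left untouched.
    return Cake(cake.dough, cake.batter, decorated=True)


# The same dough service is reused for two different products.
apple_cake = decoration_service(
    cake_service(dough_service("shortcrust"), batter_service("apple"))
)
cherry_flan = cake_service(dough_service("shortcrust"), batter_service("cherry"))
print(apple_cake)
print(cherry_flan)
```

None of the functions knows how the others work internally. They only agree on what is handed over, which is what makes the dough service reusable.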

To implement and manage an SOA service, an ESB layer is often created where the services can communicate and align. The original Amazon Web Services (AWS) are a well-known implementation of this principle. Yet many implementations have also failed, because the application landscape as a whole was not aligned with the approach and managing many services quickly became complex.

Microservices

Working with services can be even more fine-grained (and even more complex). You could still call the bakers in SOA artisanal. In an assembly line, you could have specific tasks performed by humans and robots at the micro level. Such a factory is, of course, extremely complex, but it provides the advantage of fixed quality with high output.

For software, the further refinement of services in microservices is especially advantageous in terms of scalability and flexibility.

Microservices build on the SOA architectural style. However, this style explicitly does not use an ESB layer; instead, the services communicate with each other through REST APIs over standard network protocols (HTTPS). The underlying technology (programming language/database) is chosen based on the function of the application, technology trends, and the knowledge and requirements of the implementation team. A gateway or management tooling is often used to manage the APIs.

Each microservice has only one or a few business functions. The goal, as with SOA, is that such a module is maintainable and testable. Each module can be deployed (usually in a container) independently. Often, the new microservice is then available for use via a service registry.
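What a service registry does can be shown with a small in-memory stand-in. This is a sketch under stated assumptions: the class, its methods and the example URLs are hypothetical and do not mirror the API of any particular registry product.

```python
class ServiceRegistry:
    """In-memory stand-in for a service registry (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # Maps a service name to the URLs of its running instances.
        self._services: dict[str, list[str]] = {}

    def register(self, name: str, url: str) -> None:
        """Announce that an instance of `name` is reachable at `url`."""
        self._services.setdefault(name, []).append(url)

    def deregister(self, name: str, url: str) -> None:
        """Remove an instance, for example when its container is stopped."""
        instances = self._services.get(name, [])
        if url in instances:
            instances.remove(url)

    def lookup(self, name: str) -> list[str]:
        """Return a copy of the known instance URLs for a service."""
        return list(self._services.get(name, []))


registry = ServiceRegistry()
registry.register("orders", "https://orders-1.bakery.example/api")
registry.register("orders", "https://orders-2.bakery.example/api")
registry.register("recipes", "https://recipes-1.bakery.example/api")

# An instance is taken down independently of the others.
registry.deregister("orders", "https://orders-1.bakery.example/api")
print(registry.lookup("orders"))
```

Callers ask the registry where a service lives instead of hard-coding addresses, so instances can be deployed and replaced independently.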

Because each microservice contains limited business functionality, many microservices are quickly required. This can make the installation, version control and registration of new and changed services as well as dependencies between services very complex.

Other architectural styles

  • Remote calls: Client-server setups in which applications call each other’s functions, procedures or objects directly, without middleware.
  • RPC (Remote Procedure Call): In an RPC style, a procedure is started by a client sending a request message to a server. The server executes the procedure and, once it has finished, sends back a reply.
  • CORBA (Common Object Request Broker Architecture): A standard for data integration between distributed systems. Here, the client calls objects on a remote server (instead of procedures). The CORBA standard was defined by the OMG (Object Management Group).
  • MFT (Managed File Transfer): Transferring and synchronizing data files. Focuses particularly on security, reliability and large volumes of files. Often the applications themselves are responsible for the import/export of the files.
  • SCS (Self Contained Systems): An architectural style with an emphasis on functional Separation-of-Concerns. All applications in the application landscape are self-contained Web applications that contain business logic, data and user interface. Each of these Web applications has its own business function. Data exchange between those applications is asynchronous wherever possible.
  • EDA (Event Driven Architecture): Architecture style built around events. An event-driven system consists of event emitters (aka agents), event consumers (aka sinks) and event channels. Emitters/agents detect an event (a state change) and collect the appropriate data. The event consumers receive the event and may need to respond to it. The event channels provide the interface between emitter and consumer.
  • Reactive: Event-driven architectural style that assumes responsive applications. A change in an application starts a dataflow, and this dataflow can start another one. All dataflows (also called event streams) process data asynchronously. Many principles of this style are defined in the Reactive Manifesto.
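Of the styles in this list, the event-driven one is perhaps the easiest to make concrete. The sketch below shows the three roles (emitter, channel, consumer) in Python. It is a simplification: the channel here delivers events synchronously and in-process, whereas in practice it is usually a broker that delivers them asynchronously. The oven, the topic name and the stock counter are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    topic: str
    data: dict


class EventChannel:
    """Interface between emitters and consumers (in-process and synchronous here)."""

    def __init__(self) -> None:
        self._consumers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, topic: str, consumer: Callable[[Event], None]) -> None:
        self._consumers[topic].append(consumer)

    def publish(self, event: Event) -> None:
        # Deliver the event to every consumer subscribed to its topic.
        for consumer in self._consumers[event.topic]:
            consumer(event)


class Oven:
    """Event emitter: detects a state change and publishes it with its data."""

    def __init__(self, channel: EventChannel) -> None:
        self._channel = channel

    def finish_batch(self, cake: str, quantity: int) -> None:
        self._channel.publish(
            Event("batch.finished", {"cake": cake, "quantity": quantity})
        )


stock: dict[str, int] = defaultdict(int)


def update_stock(event: Event) -> None:
    """Event consumer: reacts to finished batches by updating the stock."""
    stock[event.data["cake"]] += event.data["quantity"]


channel = EventChannel()
channel.subscribe("batch.finished", update_stock)

oven = Oven(channel)
oven.finish_batch("apple", 12)
oven.finish_batch("apple", 8)
print(dict(stock))
```

The oven does not know who consumes its events, and the stock consumer does not know who emitted them. Both only know the channel and the topic.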

Practice

Rarely is an application landscape built from a blank page. It often went something like this: in the 1970s and 1980s, administrative and primary processes were automated first. No standard software was yet available, so large monoliths were developed on mainframes.

In the 1990s, standard applications entered the market. Applications became a commodity. Thus, it became cheaper to purchase these applications to complement a custom-made system. Direct links were made between the systems, then more and more specialized middleware products were purchased. Using the middleware, application databases were synchronized (EAI).

Around 2005, companies started decoupling applications using an ESB in an SOA style. Finally, with faster changes in the market and the centrality of the Internet, they started deploying APIs based on microservices.

In the end, the overall application landscape contains different architectural styles and different types of applications. What the application landscape looks like depends roughly on:

  1. the number of years a company has been in existence
  2. the size of the company
  3. the complexity of the processes
  4. the speed of change in the marketplace
  5. the architecture function

It may well be that a startup that has just emerged uses only microservices without any middleware. The company is small and the processes are simple. Often architecture does not play a big role here yet. Once the company grows, more applications will be added. Current applications will be rebuilt and supplemented with standard applications and new technologies.

Like the business, IT goes along with new trends or hypes. Sometimes because the new is better, sometimes because the old is no longer maintainable (technical debt).

For example, see the following story from 15 years of software experience. This story is about how the application landscape grows until all kinds of (unmanageable) technologies and architectures are used.

This is the practice: a mix of different architecture styles. The cause is that, over time, different styles and different technologies are implemented for different business use cases. The practice, then, is a combination of architecture styles, on-premise/cloud, push/pull, synchronous/asynchronous and so on.

Architectural style vs. technology

Many times you see that architecture styles are associated with certain data formats, technologies and protocols. The overview below shows the architecture styles discussed and their associations:

However, it is good to separate the techniques from the architectural style. Microservices can also run on a mainframe, and a monolith can be created with REST APIs running in a Docker container. In practice, you see that these associations get confusing, especially as different styles and techniques are used interchangeably in the organization. Often this is known to architects, but many other IT staff in the organization are unaware of it.

The important thing is to use architecture styles and techniques optimally for organizational purposes. As these purposes are changing more and more rapidly, there is a trend toward architecture styles that explicitly take into account that the application landscape evolves along with the business. Whatever style is chosen, it is always a challenge to translate the complexity of an organization into one technical whole and to keep an overview of the data flows.

The roadmap

Using the software components, requirements and architectural styles, an outline of the integration platform can be created. So this is like the sketch of the factory. The roadmap is the path to that factory. That road may well be long, i.e., parts of the factory should already be operational, while other parts will be added later.

At The Bakery Group, for example, they first focus on standard cakes, such as apple and whipped cream cakes. Later, flans, pastries and wedding cakes should also be possible.

Integration, for example, can focus first on integrations with legacy systems and later on APIs. Do we go for an ESB first or directly for an iPaaS? Once it is known which building blocks are to be laid first, the acquisition of these building blocks can be started.

Usually the building blocks build on others. You build part of the factory first, learn from this, and move on. Of course, one of the advantages of software building blocks is that they are more flexible than bricks. One disadvantage, however, is that sometimes it is not clear how much work the implementation costs and what it will yield.

The final stage is to draw up the product requirements for a specific building block, for example what a broker or a gateway must be able to do. Then the selection and procurement (possibly through a tender) can take place. This should be done carefully, as these products can easily remain in use for more than 10 years. Third-party assistance with the selection, as well as running a proof of concept with the top two or three products, is often desirable for large organizations.
