4. From abbreviation to abbreviation

The main integration concepts

Raymond Meester
11 min read · Apr 5, 2024

Things are going well with Aleksandra’s cakes and pies. Her sister is now really pushing her to turn this into a business. But that suddenly involves a lot. Before she starts buying new pans and knives, she first draws up a business plan.

Aleksandra could not bring many items from Ukraine. Fortunately, she received several kitchen tools from neighbors and the community. However, some items will need to be upgraded to a professional standard: baking tools, an oven and a freezer.

Aleksandra will check with a local training institute for bakers, the Bakery Institute, to find out everything she needs. In addition, she will need to set up a sole proprietorship. So she will have to learn something about doing business in the Netherlands: taxes, insurance and marketing. An entrepreneur friend of hers has promised to guide her through the Dutch regulatory maze.

The toolbox of an integration specialist

Everyone has an opinion ready. For example, everyone has an opinion about the cakes Aleksandra bakes. Her sister has one on how to make money with a business. And there are 18 million opinions about the Dutch soccer team. Yet there must be professionals who look beyond opinions, who have real knowledge of the concepts needed to get professional results.

Aleksandra has knowledge of baking. Her friend of business. And the national coach of soccer. What knowledge do you need as an integration specialist? In this chapter, we will take a closer look at the technologies used behind data integrations. You can think of each technical component as a tool for solving an integration problem. By the end of the chapter, your toolbox will be well stocked.

Products

There are hundreds of types of integration tools, frameworks, libraries and products. One well-known list is Capterra Integration Software. This list includes well-known product suites as well as small tools that only touch on data integration.

The list does give a good idea of the number of software packages in the market, which is enormous. There are hundreds of them, but that also makes it unclear what you can do with the various tools and products. There are a lot of terms and abbreviations used in the descriptions, but the question is what they mean. Let’s find out.

Technical integration

Technical integration includes middleware components that transport data without dealing with the content. The data remains unchanged.

We cover three categories of technical integration components:

  1. Remote Procedure Calls
  2. Gateways
  3. Brokers

Each category and some subcategories are discussed. Some software products are also given in each case. At the end of the book there is a more complete example list for each category with links to the relevant software.

RPC

Remote procedure calls (RPC) allow systems to execute a remote function. As systems execute each other’s processes, there is integration between multiple systems. This is actually the traditional way to integrate systems with each other.

An RPC call is initiated by a client calling and executing a function on a remote server. This may include parameters and data. Once the procedure is executed, a result or confirmation is returned. Modern variants often use XML or JSON.

Remote Procedure Calls

Unlike a call within a single system, a remote procedure call can fail or stall the calling process because the remote system is temporarily unavailable or too slow.
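To make this concrete, here is a minimal sketch of a remote procedure call, assuming Python's standard `xmlrpc` modules as the RPC mechanism. The function `order_total` and the helper names are made up for illustration. The client first calls the function while the server is up, and then again after the server has been stopped, to show what "temporarily unavailable" looks like from the caller's side.

```python
import threading
import xmlrpc.client
from typing import Optional
from xmlrpc.server import SimpleXMLRPCServer


def order_total(quantity: int, unit_price: float) -> float:
    """The remote procedure. It runs on the server, not in the client."""
    return round(quantity * unit_price, 2)


def start_server() -> SimpleXMLRPCServer:
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(order_total, "order_total")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_order_total(port: int, quantity: int, unit_price: float) -> Optional[float]:
    """Call the remote function; return None if the server cannot be reached."""
    try:
        # The proxy wraps the parameters in XML and sends them over HTTP.
        with xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            return proxy.order_total(quantity, unit_price)
    except (OSError, xmlrpc.client.Error):
        return None


server = start_server()
port = server.server_address[1]
while_up = call_order_total(port, 3, 2.50)    # server answers with the result

server.shutdown()
server.server_close()
while_down = call_order_total(port, 3, 2.50)  # server is gone, the call fails
```

In a real integration the caller would usually also set a timeout and decide whether to retry, which this sketch leaves out.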

A Remote Procedure Call is like ordering in a restaurant. You specify what you want to eat and drink. Then you wait a while and the food is served. It usually goes well, but sometimes a thing or two goes wrong in the kitchen.

Examples RPC Software:

Gateways

A gateway is a pass-through between two applications. It ties together different transport protocols and allows for the exchange of messages. You could compare it to the function of a crane at a port transshipment site.

The crane transfers a container from one means of transportation to another. For example, from a ship to a train. A gateway can move files between protocols, like from an SFTP server to an HTTP Web service.

File gateway

In its simplest form, a file gateway is an adapter between two components. A file adapter will usually actively retrieve or deliver a file. It waits for a particular event to occur and then takes action. Some adapters are written specifically as a bridge between two protocols. Others are versatile and can handle all kinds of transport protocols, for example, SFTP, HTTP and JDBC.
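As a rough sketch of such an adapter: the code below polls an inbox folder and hands every file it finds to a delivery function. The local folder stands in for an SFTP server and `fake_http_upload` stands in for an HTTP web service; both are assumptions made for illustration, and so are all the names.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict

# A delivery function receives the file name and its unchanged content.
Deliver = Callable[[str, bytes], None]


def poll_once(inbox: Path, deliver: Deliver) -> int:
    """Pick up every file waiting in the inbox and pass it on."""
    moved = 0
    for path in sorted(inbox.iterdir()):
        if not path.is_file():
            continue
        deliver(path.name, path.read_bytes())  # content is passed through as-is
        path.unlink()                          # remove only after delivery
        moved += 1
    return moved


received: Dict[str, bytes] = {}


def fake_http_upload(name: str, content: bytes) -> None:
    # Stand-in for the target protocol, for example an HTTP POST.
    received[name] = content


with tempfile.TemporaryDirectory() as folder:
    inbox = Path(folder)
    (inbox / "order.xml").write_bytes(b"<order/>")
    first_run = poll_once(inbox, fake_http_upload)   # finds one file
    second_run = poll_once(inbox, fake_http_upload)  # inbox is now empty
```

Note that the gateway never looks inside the file. Swapping the delivery function is all it takes to bridge to another protocol.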

File Management

In simple situations, with low message volumes or when you have a lot of control over the technology used, an adapter as a file gateway may be sufficient. Once the number of adapters and message volumes increase, the need for central management and monitoring arises. In this case, adapter frameworks or MFT (Managed File Transfer) systems are needed.

MFTs offer the following three functionalities:

  1. Support for multiple protocols and formats
  2. Central configuration & management
  3. Monitoring

Examples:

In small companies, an MFT solution is often sufficient for all integration needs; in large companies, it is one component combined with others.

API Gateway

An API Gateway decouples APIs from the clients that call them. All traffic between API clients and the APIs passes through this gateway. In doing so, the API Gateway takes care of security, load balancing and caching, among other things. More on the subject of APIs can be found in the next chapter.
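A toy version of this idea, assuming a simple API-key check for security and a dictionary as cache, could look as follows. The class `ApiGateway`, the key and the paths are hypothetical, and load balancing is left out to keep the sketch short.

```python
from typing import Callable, Dict, List, Set, Tuple

# A backend API takes a request path and returns a response body.
Backend = Callable[[str], str]


class ApiGateway:
    def __init__(self, api_keys: Set[str]) -> None:
        self._api_keys = api_keys
        self._routes: Dict[str, Backend] = {}
        self._cache: Dict[str, str] = {}

    def register(self, prefix: str, backend: Backend) -> None:
        self._routes[prefix] = backend

    def handle(self, api_key: str, path: str) -> Tuple[int, str]:
        if api_key not in self._api_keys:        # security
            return 401, "unauthorized"
        if path in self._cache:                  # caching
            return 200, self._cache[path]
        for prefix, backend in self._routes.items():
            if path.startswith(prefix):          # routing to the right API
                body = backend(path)
                self._cache[path] = body
                return 200, body
        return 404, "not found"


backend_calls: List[str] = []


def cake_api(path: str) -> str:
    backend_calls.append(path)
    return f"cakes for {path}"


gateway = ApiGateway({"secret-key"})
gateway.register("/cakes", cake_api)

denied = gateway.handle("wrong-key", "/cakes/1")
first = gateway.handle("secret-key", "/cakes/1")
second = gateway.handle("secret-key", "/cakes/1")   # answered from the cache
missing = gateway.handle("secret-key", "/pies/1")
```

The client only ever talks to the gateway, so the backend API can move or change without the client noticing.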

Examples:

B2B network

A B2B (business-to-business) network provides a file and/or API gateway between organizations. Standards such as HL7, EDIFACT and GS1 are usually supported.

Brokers

A gateway is only a conduit between two components. The management tools of file gateways indicate how much and to whom something goes, but not how and what. For this, a broker can be used. A broker is a middleware component on which producers can post messages and consumers can retrieve messages.

A broker is an important component that further decouples applications from each other. The broker layer provides a generic endpoint, a generic protocol and often caching and buffering of messages. You could compare it to a distribution center. At one end, manufacturers deliver goods, while stores pick them up. The distribution center handles intermediate distribution and storage.

Basically, there are two types of broker endpoints:

  • Queue: an endpoint that holds a queue of messages. If no response is needed, it uses the Point-to-Point pattern; otherwise it uses Request-and-Reply.
  • Topic: an endpoint on which messages can be posted. Everyone subscribed to this endpoint gets a copy (Publish-Subscribe).
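The difference between the two endpoint types can be sketched with a small in-memory broker. This is only an illustration under simplifying assumptions (no network, no persistence, no acknowledgements), and the class and endpoint names are invented.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Optional

Subscriber = Callable[[str], None]


class Broker:
    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)
        self._subscribers: Dict[str, List[Subscriber]] = defaultdict(list)

    # Queue: each message is taken off the queue by exactly one consumer.
    def send(self, queue: str, message: str) -> None:
        self._queues[queue].append(message)

    def receive(self, queue: str) -> Optional[str]:
        pending = self._queues[queue]
        return pending.popleft() if pending else None

    # Topic: every subscriber gets its own copy of each message.
    def subscribe(self, topic: str, subscriber: Subscriber) -> None:
        self._subscribers[topic].append(subscriber)

    def publish(self, topic: str, message: str) -> None:
        for subscriber in self._subscribers[topic]:
            subscriber(message)


broker = Broker()

broker.send("orders", "order-1")
broker.send("orders", "order-2")
consumer_a = broker.receive("orders")
consumer_b = broker.receive("orders")
nothing_left = broker.receive("orders")

shop_inbox: List[str] = []
webshop_inbox: List[str] = []
broker.subscribe("prices", shop_inbox.append)
broker.subscribe("prices", webshop_inbox.append)
broker.publish("prices", "apple pie: 12.50")
```

On the queue, the two consumers each get a different order. On the topic, both subscribers get the same price update.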

Exactly how an endpoint works depends heavily on the technical implementation of these concepts. Here are some examples.

JMS Broker

A JMS broker is an implementation of the JMS (Java Message Service) API. This API has support for both queues and topics. Because many integration platforms are created in Java, you see JMS brokers a lot in product suites. Think of products like TIBCO, WebSphere, Oracle Fusion and Mule.

JMS is a powerful library, but heavily tied to Java and the broker implementation. Thus, you always need a client library to interface with a JMS broker. This does not necessarily have to be in Java, but clients in other programming languages are scarce.

Examples:

MSMQ

MSMQ is an implementation of message queueing by Microsoft. It is not a full-blown broker, but uses queues (temporary locations for storing messages). It is mainly used within Windows, by BizTalk and .NET applications.

Example: BizTalk

AMQP Broker

Unlike JMS and MSMQ, AMQP is not an implementation, but a protocol. It specifies the data formats that must be met by a broker or application implementing this protocol. Therefore, there are implementations and clients in a variety of programming languages. AMQP is generally a bit more extensive in its capabilities of queues and topics than JMS. For all the differences, see the following article:

Understanding the differences between AMQP & JMS

Examples:

MQTT Broker

MQTT is a lightweight messaging protocol built around topics, and an MQTT broker implements it. It is designed to use as little bandwidth as possible, which is why it is mostly used in IoT (Internet of Things) use cases, for example collecting data from sensors.

Examples:

ActiveMQ and RabbitMQ also have support for MQTT.

STOMP Broker

A STOMP broker implements the STOMP protocol. STOMP stands for Simple (or Streaming) Text Oriented Messaging Protocol. It is similar to HTTP. HTTP is implemented on top of TCP with methods such as GET and POST. The STOMP protocol also runs on top of TCP, but defines commands specific to message-oriented middleware, such as SEND and SUBSCRIBE.
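To show how text-oriented the protocol is, here is a small sketch that builds and parses STOMP frames by hand. A frame is a command line, a list of `name:value` headers, a blank line, the body and a terminating NUL byte. The destination `/topic/prices` and the helper names are assumptions for illustration; header escaping and the optional `content-length` header are left out.

```python
from typing import Dict, Tuple


def build_frame(command: str, headers: Dict[str, str], body: str = "") -> bytes:
    """Serialize a frame: command, headers, blank line, body, NUL byte."""
    head = "\n".join([command] + [f"{name}:{value}" for name, value in headers.items()])
    return f"{head}\n\n{body}".encode("utf-8") + b"\x00"


def parse_frame(frame: bytes) -> Tuple[str, Dict[str, str], str]:
    """Split a frame back into command, headers and body."""
    text = frame.rstrip(b"\x00").decode("utf-8")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


# A consumer subscribes to a destination ...
subscribe = build_frame("SUBSCRIBE", {"id": "0", "destination": "/topic/prices"})
# ... and a producer sends a message to it.
send = build_frame("SEND", {"destination": "/topic/prices"}, "apple pie: 12.50")

parsed = parse_frame(send)
```

Just as you can read a raw HTTP request, you can read these frames with the naked eye, which is a large part of the protocol's appeal.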

Examples:

ActiveMQ and RabbitMQ also have support for STOMP.

Streaming broker

Streaming brokers are designed so that data in topics is distributed across multiple nodes. As a result, these distributed brokers are highly scalable and can handle a lot of data.
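One common way to spread a topic over nodes is to split it into partitions and pick a partition based on a message key. The sketch below assumes a CRC32 hash of the key; real streaming brokers use their own partitioning schemes, so treat the hash choice and all names as illustrative only.

```python
import zlib
from typing import List, Tuple

Record = Tuple[str, str]  # (key, value)


class PartitionedTopic:
    def __init__(self, partition_count: int) -> None:
        # In a real cluster each partition would live on some node.
        self.partitions: List[List[Record]] = [[] for _ in range(partition_count)]

    def partition_for(self, key: str) -> int:
        # Assumption: a stable hash of the key decides the partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key: str, value: str) -> int:
        index = self.partition_for(key)
        self.partitions[index].append((key, value))
        return index


topic = PartitionedTopic(partition_count=3)
first = topic.publish("sensor-42", "21.5")
second = topic.publish("sensor-42", "21.7")
other = topic.publish("sensor-7", "19.0")
total = sum(len(partition) for partition in topic.partitions)
```

Messages with the same key always land in the same partition, so their order is preserved there, while different keys can be handled by different nodes in parallel.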

Examples:

In the Queueing vs Streaming broker section, we will go into more detail about how brokers work and what the differences are.

Apache

Many technical middleware components are developed under the umbrella of the Apache Software Foundation (ASF). The most well-known projects are Apache Kafka, Apache ActiveMQ and Apache Camel.

The ASF is a foundation supported by large (software) companies. It provides open source licenses, legal support, hosting and so on. All of the foundation’s projects can be found at apache.org.

Often Apache projects are not ready-made end products, but frameworks and libraries that multiple companies are working on. This software often operates in a backend of companies such as Facebook and Google. Sometimes a combination of multiple Apache projects is offered as a commercial product (Red Hat and Cloudera are well-known companies that have Apache Foundation products in their portfolio).

Functional integration

Functional integration involves middleware components that bridge data content.

The main categories are:

  1. Dataflow
  2. ESB

Dataflow

Dataflow is a style of integration based on “flow-based programming.” Coined by J. Paul Morrison in the 1970s, this style is based on components that process data messages and can be dynamically connected to each other. Each component acts as a black box with inputs and outputs. It is therefore a generic method of data processing (also called stream processing) that can be used for various data integration tasks.
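The black-box idea can be sketched with Python generators: each component consumes a stream of messages and produces a new stream, and a flow is simply a chain of components. The component names and the `connect` helper are made up for this example.

```python
from typing import Callable, Iterable, Iterator

# A component is a black box: messages in, messages out.
Component = Callable[[Iterable[str]], Iterator[str]]


def drop_blank(messages: Iterable[str]) -> Iterator[str]:
    for message in messages:
        if message.strip():
            yield message


def to_upper(messages: Iterable[str]) -> Iterator[str]:
    for message in messages:
        yield message.upper()


def connect(*components: Component) -> Component:
    """Wire components together: the output of one is the input of the next."""
    def flow(source: Iterable[str]) -> Iterator[str]:
        stream: Iterable[str] = source
        for component in components:
            stream = component(stream)
        return iter(stream)
    return flow


flow = connect(drop_blank, to_upper)
output = list(flow(["apple pie", "   ", "cheesecake"]))
```

Because every component has the same shape, you can reorder, remove or add steps without touching the others. Tools like the ones below offer the same wiring through a visual editor.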

Dataflow programming with Node-RED

The best-known dataflow middleware products are Apache NiFi, Spring Cloud Data Flow and Node-RED. All three projects are open source. NiFi has its origins at the U.S. intelligence agency NSA, Spring Cloud Data Flow comes from VMware, and Node-RED was originally developed at IBM.

See also: https://en.wikipedia.org/wiki/Flow-based_programming

Examples:

ESB

Enterprise Service Bus is a somewhat confusing term. It has nothing to do with buses you take to the station. It also has nothing to do with USB (Universal Serial Bus). At least USB puts us on the right track. A bus generally means data exchange between hardware components. A service bus involves data exchange between software components.

The idea is that the bus provides a set of standard services within an enterprise, which can be called from various other software components.

Multiple services can also be called in succession. Depending on the software vendor, this is referred to as processes, routes or flows. The service types and the ways to combine them are described in the book Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf, which was discussed earlier.

The term ESB is currently falling somewhat into disuse. Usually it is now simply referred to as a Service Bus or is part of an integration platform (for example, Mule Anypoint).
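As a rough illustration of a route, the sketch below registers a few named services on a bus and then calls them in succession on one message. The `ServiceBus` class, the service names and the message fields are all hypothetical; real ESB products offer far more (error handling, adapters, monitoring).

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Service = Callable[[Message], Message]


class ServiceBus:
    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, name: str, service: Service) -> None:
        self._services[name] = service

    def run_route(self, steps: List[str], message: Message) -> Message:
        """Call the named services one after another on the same message."""
        for step in steps:
            message = self._services[step](message)
        return message


def validate(message: Message) -> Message:
    if "id" not in message:
        raise ValueError("message has no id")
    return message


def enrich(message: Message) -> Message:
    return {**message, "currency": "EUR"}


def translate(message: Message) -> Message:
    # Rename a field so the receiving system understands it.
    renamed = dict(message)
    renamed["orderNumber"] = renamed.pop("id")
    return renamed


bus = ServiceBus()
bus.register("validate", validate)
bus.register("enrich", enrich)
bus.register("translate", translate)

result = bus.run_route(["validate", "enrich", "translate"], {"id": "1001", "product": "apple pie"})
```

Unlike the technical integration components earlier in this chapter, these services do look at and change the content of the message.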

Examples:

ESB Frameworks

Frameworks such as Apache Camel, Spring Integration and Zato deserve special mention. These are frameworks that allow you to build Enterprise Integration Patterns into software. They also serve as the basis for various ESB products. For example, Apache Camel is a basic component of platforms like Apache ServiceMix, Red Hat Fuse, Talend ESB, SAP Integration, Dovetail and Talisman.

Integration-related software

A characteristic of middleware is that it is often cross-domain. The following components are also deployed organization-wide and are therefore frequently combined with data integration as part of functional integration:

BRE

A Business Rules Engine (BRE) executes automated business rules. Because BRE systems often work independently of applications and are deployed company-wide, they generate a lot of data traffic.
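In its most basic form, a business rule is a condition plus an action. The following sketch, with invented rules and field names, shows how an engine could evaluate such rules against the data of an order, separate from the application that created the order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Facts], bool]
    action: Callable[[Facts], None]


def run_rules(rules: List[Rule], facts: Facts) -> List[str]:
    """Apply every rule whose condition holds; return the names that fired."""
    fired = []
    for rule in rules:
        if rule.condition(facts):
            rule.action(facts)
            fired.append(rule.name)
    return fired


# Hypothetical rules for Aleksandra's bakery.
rules = [
    Rule("bulk discount", lambda f: f["quantity"] >= 10, lambda f: f.update(discount=0.10)),
    Rule("free delivery", lambda f: f["total"] >= 50, lambda f: f.update(delivery_fee=0.0)),
]

order: Facts = {"quantity": 12, "total": 36.0, "delivery_fee": 4.95}
fired = run_rules(rules, order)
```

The order data has to travel to the engine and back, which is where the data integration comes in.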

Workflows, like data flows, are series of steps that are performed. Each step can be carried out by a human or a system. Usually a workflow is a long-running process in which a number of tasks (units of work) are performed. Often these tasks are recorded in a data format, such as XML. This allows part of a workflow to be performed by a data integration.

BPM

Business Process Management systems are applications that execute workflows modeled in BPMN (Business Process Model and Notation). Like any workflow, a BPM workflow is a series of steps to capture and automate certain work processes. Often these consist partly of manual and partly of automated steps. Because they involve a lot of data entry and enrichment, they perform many data operations or are linked to middleware, such as brokers, gateways or an ESB.

Examples:

Pipelines

Pipelines are related to processes and workflows, with the difference that there is always an outcome, for example, a software build or deployment (Continuous Delivery Pipeline). Usually a pipeline is started by a human or an event and goes through a number of (linear) steps.

ETL

ETL (Extract, Transform and Load) is an example of a pipeline for data. It is a series of steps to make data suitable to be stored, combined and analyzed in a standard format. Usually, this is data for a data warehouse, where data is gathered from different sources to form a single coherent whole.
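The three steps can be sketched in a few lines, assuming a CSV text as the source and an in-memory SQLite database as a stand-in for the data warehouse. The sample data, table name and column names are invented for the example.

```python
import csv
import io
import sqlite3
from typing import Dict, List, Tuple

RAW_CSV = "name,price_eur\nApple pie,12.50\n cheesecake ,15\n"


def extract(text: str) -> List[Dict[str, str]]:
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]]) -> List[Tuple[str, int]]:
    """Transform: clean up names and store prices as whole cents."""
    return [
        (row["name"].strip().title(), int(round(float(row["price_eur"]) * 100)))
        for row in rows
    ]


def load(connection: sqlite3.Connection, records: List[Tuple[str, int]]) -> None:
    """Load: write the standardized records to the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS products (name TEXT, price_cents INTEGER)"
    )
    connection.executemany("INSERT INTO products VALUES (?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(connection, transform(extract(RAW_CSV)))
stored = connection.execute(
    "SELECT name, price_cents FROM products ORDER BY name"
).fetchall()
connection.close()
```

Each source would get its own extract and transform step, while the load step and the target format stay the same. That is how data from different sources ends up as one coherent whole.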

Examples:

And now the key question?

The question that remains is: which tool do I use for what? This question is difficult (if not impossible) to answer in general, because it depends on, for example:

  • organization size
  • type of technologies
  • knowledge in the organization and the market
  • architectural vision
  • tech trends

Very roughly, smaller organizations can make do with an MFT product or an iPaaS (integration Platform as a Service). Larger organizations often have a combination of:

  • MFT (Files sync and sharing)
  • API Management (APIs)
  • ESB/Broker (Integrations)
  • iPaaS (Cloud integrations)
  • Gateways (Links)

In the following chapters we will go deeper into the different technologies, so that it is always clear which middleware and concepts are used for what.
