
I would like to share my opinionated tech stack experiments for data in motion and analytics.
In the world of streaming technologies, PostgreSQL stands out thanks to its low-overhead replication protocol. This enables the use of the change data capture (CDC) pattern to replicate data incrementally.
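To make the CDC pattern concrete, here is a minimal sketch in plain Python of how a consumer might apply a stream of change events to a replica incrementally. It does not connect to PostgreSQL; the `ChangeEvent` shape, the `apply_changes` helper, and the integer `lsn` field (standing in for a log sequence number) are all assumptions made for illustration, not the format of any real replication client.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record; real CDC tools use richer formats."""
    lsn: int                      # position in the change log (assumed integer)
    op: str                       # "insert", "update", or "delete"
    key: int                      # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(
    replica: Dict[int, Dict[str, Any]],
    events: Iterable[ChangeEvent],
    last_lsn: int = 0,
) -> int:
    """Apply events newer than last_lsn to the replica, in log order.

    Returns the new checkpoint so the next call can resume from it.
    """
    for event in sorted(events, key=lambda e: e.lsn):
        if event.lsn <= last_lsn:
            continue  # already applied; skipping makes replays harmless
        if event.op in ("insert", "update"):
            replica[event.key] = dict(event.row or {})
        elif event.op == "delete":
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        last_lsn = event.lsn
    return last_lsn


# Usage: only the changes are shipped, never a full table copy.
replica: Dict[int, Dict[str, Any]] = {}
events = [
    ChangeEvent(lsn=1, op="insert", key=1, row={"name": "ada"}),
    ChangeEvent(lsn=2, op="update", key=1, row={"name": "Ada"}),
    ChangeEvent(lsn=3, op="insert", key=2, row={"name": "bob"}),
    ChangeEvent(lsn=4, op="delete", key=2),
]
checkpoint = apply_changes(replica, events)
```

Keeping a checkpoint is what makes the replication incremental: after a restart, the consumer resumes from the last applied position, and replaying older events leaves the replica unchanged.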
My Reference Architecture diagram
Image-1 below shows the base architecture used at most of the companies I have worked with. It is a simplified architecture that should fit most companies, with the operational view and the analytical view kept separate.
Image-1: Simplified architecture separating operational and analytical views
Image-2 below shows the architecture as I currently see it: the analytical view is a special case of the operational view.
Image-2: Simplified architecture considering the analytical view as a special case of the operational view
Technology stack
All the components are open-source and can run on any cloud provider or on-premises. I will not list the technologies here because everything will be in the code repository:
https://github.com/drr00t/my-data-in-motion-tech-stack
Disclaimer
This is my opinionated tech stack, and it is not a recommendation. It is based on my experience and the experience of the companies I have worked with. It is not a definitive list of technologies, and it is not exhaustive. It is a starting point for your own research and experimentation.