Welcome to the first part of our series on data platforms! In part 1, we go over the main principles for platform success.
In our unique role as a data consultancy, we have seen many customers that are gravitating towards the ideas of a data mesh. The decentralised nature of data mesh requires a self-serve data platform to enable the autonomy of the teams.
We have seen our clients tackle this in different ways. We are writing this series to help companies navigate the complexities of building and managing successful data platforms.
About The (Data) Platform
It has been known for some time that the later a problem is found in the development process, the more expensive it is to fix. As a response to this, responsibilities such as security, privacy and financial operations are being pushed further up the chain to the development teams. This push is called the shift-left paradigm.
This shift offers significant benefits, such as identifying issues early on, encouraging a sense of ownership and providing the development teams with autonomy. Alongside these benefits, it introduces one big challenge: additional cognitive load.
You don’t want the teams that are involved with the development process to be so caught up in the additional responsibilities that they can only allocate a marginal amount of time towards bringing value to the business. So, how can we still get the benefits of the shift-left paradigm, while keeping the cognitive load of the teams at a healthy level?
That is where platforms come in. The aim of a platform is to reduce the cognitive load of its end users by providing them with solutions for cross-cutting concerns. This is done by creating enablers, accelerators and abstractions (from here on, we refer to these collectively as “services”) that remove the complexity and friction of managing the lifecycle of their products, allowing the development teams to focus on what matters. A service could be as simple as a document with best practices to reduce cloud cost. It could be a library that helps end users provision infrastructure that already addresses security requirements and other best practices. It could even be a fully managed service, in the case that multiple end users have a need for it.
In this series, we are going to focus on platforms where developer experience is central and the platform is kept as lean as possible. Team Topologies calls such platforms “Thinnest Viable Platforms”, or TVP for short.
Main Principles for Platform Success
A platform that is not used is worthless.
You could try to mandate the use of the platform, but then you lose one of the best metrics to assess how valuable the platform and its services actually are: the amount of adoption. A platform that solves problems will be used.
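As a minimal sketch of how adoption could be tracked, the snippet below computes the share of eligible teams that actually use a service. The function name and the team names are hypothetical, and what counts as "using" a service is an assumption you would have to define for your own platform.

```python
def adoption_rate(teams_using: set[str], eligible_teams: set[str]) -> float:
    """Share of eligible teams that use a service (0.0 to 1.0).

    Teams that use the service but are not in the eligible set are ignored.
    """
    if not eligible_teams:
        return 0.0
    return len(teams_using & eligible_teams) / len(eligible_teams)


# Hypothetical example: two out of four workload teams use the service.
rate = adoption_rate({"sales", "finance"}, {"sales", "finance", "logistics", "hr"})
print(f"adoption: {rate:.0%}")
```

Tracked over time and per service, a number like this shows which services solve a real problem without the platform being mandated.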
To make a platform useful, there are 3 main principles to follow:
Create services that the end users of the platform actually need; the service should resolve a real pain point.
Focus on developer experience to make it as easy as possible for the end users to adopt and use a service.
Keep the platform as thin as possible.
Understanding Platform Customers
To be able to create services that the users of the platform actually need, there needs to be a clear understanding of who the end users are.
Understanding the end user is not a one-time activity. It is a continuous process that should start before you begin to work on a platform service, and should continue long after the service is live. Remember, if you make the life of your users easier, they will advocate for your solution, and that is incredibly powerful for adoption.
When starting out with the platform, we encourage you to first segment the types of stakeholders. Each segment will have different needs and preferences. Understanding which segment is the most influential will help you prioritise and focus on their needs first.
From our experience, the main stakeholder segments are:
Developers: they are often the end-users of the services that the platform provides. As they work with the services on a daily basis, and the services are mostly there to make their life easier, this should be your most influential segment.
Product owners: they are interested in the roadmap of the platform, to know when and how the platform will accelerate their team. They are also interested in metrics that show how their “solution” performs. Think about technical maturity, (cloud) cost, technical debt, etc.
Governance: for the sake of illustration, we have grouped the various governance bodies like security, privacy and architecture under one umbrella. These bodies define policies which the platform is able to incorporate in the services it provides. A concrete example: security wants all data on S3 buckets to be encrypted by customer-managed keys. The platform can enforce such a policy in the infrastructure-as-code library that it provides to workload teams, ensuring these policies are addressed by design.
Leadership: their main interests are improving the developer experience, being able to verify that the platform team is actually accelerating teams and overall statistics to provide insight into the workloads that the platform serves. Think about (cloud) cost, projection of cost for next year, compliance scores, etc.
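To make the governance example above more concrete, here is a minimal sketch of how a platform library could build the encryption policy into the way buckets are defined. It does not use a real infrastructure-as-code framework: BucketSpec, render_bucket and the output dictionary are hypothetical and simplified, and the ARN check is only a format check (it cannot tell a customer-managed key from an AWS-managed one).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketSpec:
    """Hypothetical, simplified bucket request from a workload team."""
    name: str
    kms_key_arn: str  # key the workload team wants the bucket encrypted with


def render_bucket(spec: BucketSpec) -> dict:
    """Return a bucket definition that always has KMS encryption configured.

    The library refuses to render a bucket without a KMS key ARN, so a
    workload team cannot forget the policy. Assumption: this only checks
    the ARN format, not who manages the key.
    """
    if not spec.kms_key_arn.startswith("arn:aws:kms:"):
        raise ValueError(f"bucket {spec.name!r} needs a KMS key ARN")
    return {
        "bucket_name": spec.name,
        "encryption": {"algorithm": "aws:kms", "kms_key_arn": spec.kms_key_arn},
    }


# Hypothetical usage by a workload team.
definition = render_bucket(
    BucketSpec("raw-events", "arn:aws:kms:eu-west-1:111122223333:key/example")
)
print(definition["encryption"]["algorithm"])
```

The design choice here is that the secure path is the only path: the workload team supplies a key and gets a compliant bucket, without having to know the details of the policy.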
When starting out with a platform, we recommend conducting focus groups with the main stakeholder segments that we have identified. The focus groups will provide a lot of ideas that, after grouping and prioritising, are great input for your backlog.
To be able to keep evolving the platform to best serve the end users, there should be ways for them to provide feedback. We prefer the use of internal channels (Teams / Slack), so it is transparent what is being asked for across all stakeholders.
Putting Developer Experience at the Core
Developer Experience should be the key focus for anything that the platform provides.
From a process point-of-view, you want to minimise the dependencies that end-users have on the platform team. One of the best ways of doing so is by opting for self-service instead of a ticketing approach. This provides faster resolution times, greater autonomy for the end-users and a reduced workload for the platform team. We encourage the platform to make use of GitOps, as it provides a great starting point for many of the self-service tasks, while only relying on technologies that are widely adopted by organisations.
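One way to picture self-service through GitOps: instead of raising a ticket, a team commits a small request file to a repository, and an automated check validates it before it is merged and applied. The sketch below shows only such a check. The file format, the field names and the allowed sizes are all assumptions made for illustration.

```python
import json

# Hypothetical set of sizes the platform offers for a service.
ALLOWED_SIZES = {"small", "medium", "large"}


def validate_request(raw: str) -> list[str]:
    """Return a list of problems found in a JSON request.

    An empty list means the request can be merged without the platform
    team having to look at it.
    """
    request = json.loads(raw)
    problems = []
    for field in ("team", "service", "size"):
        if not request.get(field):
            problems.append(f"missing field: {field}")
    if request.get("size") and request["size"] not in ALLOWED_SIZES:
        problems.append(f"unknown size: {request['size']}")
    return problems


# Hypothetical request committed by a workload team.
print(validate_request('{"team": "sales-analytics", "service": "warehouse", "size": "small"}'))
```

Because the check runs on every proposed change, end users get feedback in minutes and the platform team only gets involved when something falls outside the paved path.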
From a “service” point-of-view, you want to make it as easy as possible for an end-user to get started. This requires great and concise “getting-started” documentation, working examples and even bootstrap scripts, so that adopters can hit the ground running.
Keeping the Platform Nimble
Historically, platforms do not always have the best reputation. They are often associated with mandated and hard-to-use services, endless ticketing, and long lead times. Instead of enabling and accelerating end-users, they often do the opposite. We want to prevent that from happening here, and because of that, there needs to be strong product management in the platform team.
For any newly requested service, there should be a strong indication that it is worth providing. The time and cost saved should outweigh the investment. As a general rule, you want to make sure that services will be used by more than one stakeholder.
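The rule of thumb above can be written down as a simple calculation. The sketch below is one possible way to do so. The parameter names, the use of hours as the unit and the 12-month horizon are assumptions, and a real decision would weigh more than these numbers.

```python
def worth_providing(
    hours_saved_per_team_per_month: float,
    teams: int,
    build_hours: float,
    run_hours_per_month: float,
    horizon_months: int = 12,
) -> bool:
    """Rough check: do the savings outweigh the investment over the horizon?

    Also applies the rule that a service should have more than one user.
    """
    if teams < 2:
        return False
    saved = hours_saved_per_team_per_month * teams * horizon_months
    invested = build_hours + run_hours_per_month * horizon_months
    return saved > invested


# Hypothetical request: 5 teams each save 4 hours a month (240 hours a year),
# against 120 hours to build and 5 hours a month to run (180 hours a year).
print(worth_providing(4, teams=5, build_hours=120, run_hours_per_month=5))
```

With these example numbers, the 240 hours saved exceed the 180 hours invested, so the service passes the check. The same request from a single team would not.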
For existing services, periodically check if they are still relevant. Services that are not used by workload teams do not add value and only incur operational cost, so they should be deprecated.
Next to keeping the number of offerings as small as possible, there should also be a strong focus on building the simplest solution possible that addresses the need. A great example of this can be found in Team Topologies’ description of a TVP:
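A periodic relevance check could be as simple as listing the services that no workload team has used recently. The sketch below assumes you can obtain a last-used date per service. The function name, the service names and the 90-day threshold are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def deprecation_candidates(
    last_used: dict[str, Optional[date]],
    today: date,
    max_idle_days: int = 90,
) -> list[str]:
    """Return services that were never used or not used since the cutoff."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        name for name, used in last_used.items() if used is None or used < cutoff
    )


# Hypothetical usage data for three services.
usage = {
    "iac-library": date(2024, 5, 20),
    "legacy-etl": date(2023, 12, 1),
    "report-portal": None,  # never used
}
print(deprecation_candidates(usage, today=date(2024, 6, 1)))
```

The output is a list of candidates, not a verdict: a conversation with the remaining users should still come before a service is actually deprecated.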
This TVP could be just a wiki page if that's all you need for your platform - it says we use this cloud provider and we only use these services from a cloud provider and here's the way we use them. That might just be a wiki page - that might be your platform.
Having a nimble platform makes it possible to keep addressing the changing needs of the end-users, and to make sure that the platform is doing what it is supposed to do: empowering the organisation.
Stay tuned for the next post, where we dive into how we set up the technical foundation for a platform.