<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[DataChef]]></title><description><![CDATA[DataChef is a big data consultancy based in Amsterdam. We help modern marketing and sales teams by demystifying data and simplifying information.]]></description><link>https://blog.datachef.co</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1704892077649/5jPx7BiLp.png</url><title>DataChef</title><link>https://blog.datachef.co</link></image><generator>RSS for Node</generator><lastBuildDate>Wed, 22 Apr 2026 21:24:13 GMT</lastBuildDate><atom:link href="https://blog.datachef.co/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How Product Organizations Scale Without Splitting Strategy from Execution]]></title><description><![CDATA[Quick recap from my previous article: In high-performing product organizations, strategy and execution aren't split between different roles. Every product team needs someone who owns both the "why" an]]></description><link>https://blog.datachef.co/how-product-organizations-scale-without-splitting-strategy-from-execution</link><guid isPermaLink="true">https://blog.datachef.co/how-product-organizations-scale-without-splitting-strategy-from-execution</guid><category><![CDATA[Product Management]]></category><category><![CDATA[Organization Design]]></category><category><![CDATA[team topologies]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Wed, 08 Apr 2026 15:05:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/673ca28eaf27dbd59d38eb71/be9d6136-eb48-40fa-b975-5138a377222f.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p><strong>Quick recap from my</strong> <a href="https://blog.datachef.co/product-owner-product-manager-strategy-execution"><strong>previous article</strong></a><strong>:</strong> In high-performing product organizations, strategy and execution aren't split between different roles. Every product team needs someone who owns both the "why" and accountability for what is built: someone with strong commercial instinct, ownership of outcomes, and direct connection to execution.</p>
</blockquote>
<h2>"But does this scale?"</h2>
<p>Now let me address the inevitable objection: "Sure, but that integrated model might work at startup scale. Once you have hundreds of teams and complex enterprise environments, you <em>need</em> to split these responsibilities for scalability."</p>
<p>And while the scale at which you operate can influence your organizational design, a good principle is good at any size, and a bad principle is bad at any size.</p>
<p>There are several empirical examples of companies that scaled to massive size without splitting strategy from execution. Google doesn't have Product Owners. Neither do Cloudflare or Netflix. They scale not by fragmenting the role, but by managing scope: ensuring each Product Manager has a clear, bounded mandate where they can make meaningful impact without drowning in dependencies.</p>
<h2>A sustainable model to scale Product teams</h2>
<p>If you shouldn't split strategy from execution, how do you scale product management?</p>
<p>The answer is <strong>vertical, not horizontal.</strong> You scale by expanding scope, not by fragmenting responsibilities:</p>
<p><strong>Product Manager</strong> → Owns a single product or module with clear boundaries. Responsible for both strategy and execution. Measured on business outcomes and impact. Works with a squad sized and staffed appropriately to the scope. Following <a href="https://blog.datachef.co/fast-flow-conf-2025-team-topologies-language-of-flow">Team Topologies</a> principles, the squad must be optimized for fast flow and own the end-to-end value generation of a slice of the problem space. In startups, that slice will be very big. In huge corporates, that slice will be very small.</p>
<p><strong>Group Product Manager / Head of Product</strong> → Owns a portfolio of products or a complex product with multiple independent modules. Sets portfolio-level goals. Acts as people manager for Product Managers. Provides coordination across products and platforms.</p>
<p><strong>Director of Product / VP / CPO</strong> → Owns product organization vision, culture, tooling, and structure. Ensures product management practices scale effectively. Builds systems that empower PMs to make impact. In smaller companies that don’t need 2+ management layers, this role is usually merged with the previous one. In startups, this is usually the CTO (heading both the Engineering and Product functions).</p>
<p>In this structure, the Group PM isn't "doing strategy" while PMs "execute." Instead, the Group PM works at a higher level of abstraction—setting portfolio goals and ensuring alignment—while individual Product Managers maintain full ownership of both strategy <em>and</em> execution within their scope.</p>
<p>The Group PM asks: "Are we working on the right set of products to achieve our business objectives?"</p>
<p>The PM asks: "Is my product's roadmap delivering maximum impact on those objectives?"</p>
<p>Both are strategic. Both are connected to execution. The difference is scope, not separation of concerns.</p>
<img src="https://cdn.hashnode.com/uploads/covers/673ca28eaf27dbd59d38eb71/ac428878-4dac-436a-9330-7c72dbfa1868.png" alt="A metaphorical representation of the different scope managed by Product Managers, Group PMs, and Directors of Product." style="display:block;margin:0 auto" />

<h2>Assigning the right scope</h2>
<p><strong>The key to scaling is assigning the right scope to each Product Manager.</strong> Not so broad that they're drowning in dependencies and can't make meaningful decisions. Not so narrow that they lack autonomy or can't see how their work connects to business outcomes.</p>
<p>When you get the scope right and each PM has a clear mandate and a squad-sized team to execute it, you can scale this model from dozens to hundreds of product teams without introducing artificial role splits.</p>
<h3>What does "right scope" look like?</h3>
<ul>
<li><p><strong>Clear boundaries:</strong> The PM can make most decisions without constant cross-team coordination. There are no debates about ownership and no handoffs between teams in the value chain.</p>
</li>
<li><p><strong>Measurable impact:</strong> Success can be tied to specific business outcomes, not just feature completion.</p>
</li>
<li><p><strong>The right team:</strong> The PM works with a squad that can execute the roadmap without being stretched too thin or sitting idle. The squad’s combined skills allow the team to ship meaningful work without having to delegate work to another team.</p>
</li>
<li><p><strong>Strategic autonomy</strong>: The PM has enough freedom to experiment and adjust course based on learnings. The squad’s ability to ship new features is not constrained by heavy dependencies on other teams.</p>
</li>
</ul>
<h2>The takeaway: scaling is about scope, not splitting</h2>
<p>Scaling is possible even with a single person managing both product strategy and execution. The most important thing is setting the scope right and using the management chain to coordinate overarching topics.</p>
<p>If you’ve been following the broader “Product Owner vs Product Manager” debate from the <a href="https://blog.datachef.co/product-owner-product-manager-strategy-execution">previous article</a>, it can be tempting to treat scaling as a question of titles and role boundaries.</p>
<p>But the pattern underneath is the same: organizations split “strategy” from “execution” when they’re trying to compensate for a lack of true outcome ownership.</p>
<p>So the leadership takeaway is this: <strong>companies get the product management they incentivize.</strong> If you measure people on throughput, compliance to process, and “staying on plan,” you’ll get backlog managers and ceremony owners, regardless of whether you call them PMs or POs. If you give clear scope and accountability for impact, you’ll get product leaders who can hold the whole loop: problem → delivery → learning → outcomes.</p>
<p>When you get <em>that</em> right—scope and incentives aligned—your product organization can scale to hundreds of teams without ever needing to split product strategy from execution.</p>
]]></content:encoded></item><item><title><![CDATA[Legacy Migration Starts with Understanding, not Inventory]]></title><description><![CDATA[The Default Playbook
Legacy migration has an almost universal playbook:

Step 1 - Asset Discovery: Make an export of all assets in the environment to see what is there to migrate. The output is usuall]]></description><link>https://blog.datachef.co/legacy-migration-starts-with-understanding-not-inventory</link><guid isPermaLink="true">https://blog.datachef.co/legacy-migration-starts-with-understanding-not-inventory</guid><dc:creator><![CDATA[Shahin]]></dc:creator><pubDate>Tue, 07 Apr 2026 12:56:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/6193e4c293892e4586936e3d/2bd9ce90-ebf7-4024-b6d8-0bcced7d1dde.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Default Playbook</h2>
<p>Legacy migration has an almost universal playbook:</p>
<ul>
<li><p><strong>Step 1 - Asset Discovery:</strong> Make an export of all assets in the environment to see what is there to migrate. The output is usually a massive spreadsheet or dashboard. The leadership sees the number and it becomes the anchor: We have X thousand resources to migrate.</p>
</li>
<li><p><strong>Step 2 - Categorize:</strong> Tag the assets based on their domain, owner and the type of migration they potentially require. This is usually done manually.</p>
</li>
<li><p><strong>Step 3 - Assign:</strong> Now that we know what we have, and who owns it, it's time to assign each slice to a team and say: These are yours. Assess them, decide what to do with each one, and report back.</p>
</li>
<li><p><strong>Step 4 - Teams Investigate and Act:</strong> Each team is expected to review their assigned assets, determine what's still needed, plan the migration, rewrite what's valuable, and flag the rest for deletion.</p>
</li>
<li><p><strong>Step 5 - Track and Report Progress:</strong> A program manager tracks completion rates. Dashboards show how many assets have been categorized, how many migrated, how many deleted, and the progress is measured as a percentage of the original inventory.</p>
</li>
</ul>
<h2>The Hidden Assumption</h2>
<p>Every step in this playbook sounds reasonable on its own. Together, they rest on assumptions that rarely hold:</p>
<h3>Every asset has an owner</h3>
<p>The plan assumes that for each resource, someone in the organization knows what it is, why it exists, and whether it still matters. In practice, organic environments accumulate resources that outlive the teams or individuals who created them. The owner left, the project ended, but the assets are still running and still incurring costs.</p>
<h3>Assets map to products or business domains</h3>
<p>The plan assumes you can draw lines from resources to business capabilities:</p>
<ul>
<li><p>these tables belong to marketing analytics.</p>
</li>
<li><p>those belong to finance reporting.</p>
</li>
<li><p>...</p>
</li>
</ul>
<p>In reality, only a small percentage of assets map cleanly to such groups. Most are organized around pipelines, ad-hoc SQL, and intermediate computation steps, not around products.</p>
<h3>Inventory equals understanding</h3>
<p>This is the deepest assumption. Listing everything feels like progress. But any environment painful enough (in cost or maintenance) to justify migration has accumulated significant scale. The inventory will contain hundreds of thousands of assets with no inherent grouping. It feels like the first logical step. It's actually the first illusion of control.</p>
<h3>Teams can self-serve classification</h3>
<p>The plan assumes you can hand teams a filtered list and they'll sort it out: keep, migrate, delete. But this requires that:</p>
<ol>
<li><p>Teams actually recognize the assets, which usually means reverse-engineering their history.</p>
</li>
<li><p>The rest of the organization, dependent on those resources, holds still. But what if they change their process during the migration, and with it, their requirements?</p>
</li>
</ol>
<h3>Effort scales with volume</h3>
<p>The intuition is: twice as many assets, twice as much work. But in reality, effort scales with ambiguity. A thousand well-structured, well-known assets might be easier to migrate than a handful of orphaned, unnamed ones. The cost is in the investigation and understanding, not the count.</p>
<h2>What You Actually Find</h2>
<p>I recently faced this exact situation, and the reality was eye-opening. A data warehouse that had accumulated tables, data pipelines, and scheduled queries over the past 10 years. The tables alone numbered over half a million.</p>
<p>No reasonable grouping would make that number workable. Not even if every technical member focused solely on migration and parked all other work.</p>
<p>So I asked a different question: do we really need to migrate all of this? Asking teams wasn't an option, for all the reasons above. But I could observe the environment directly. Are any of these tables actually being read?</p>
<p>And the result was surprising. Over 99% of the tables hadn't been accessed in the past 90 days. A large portion had never been accessed at all. Of those that were accessed, only a small percentage showed consistent, ongoing usage.</p>
<p>We stopped asking teams to review the full inventory. The migration was not about moving half a million assets to a new platform. It was about finding the ones that were still alive.</p>
<h2>Why Understanding IS the Migration</h2>
<p>The default playbook frames migration as logistics: X thousand assets need to move from A to B, and progress is the percentage that have moved. This treats every asset as a unit of work. But the real unit of work is not the asset. It is the decision: does this asset still matter?</p>
<p>That reframing changes the shape of the project. A logistics project scales with volume: twice the assets, twice the effort. A decision project scales with ambiguity. And ambiguity is not evenly distributed. In our case, one query against 90 days of usage data made the decision for 99% of the environment. No team meetings, no reverse-engineering, no spreadsheets. The remaining 1% was the only part that needed human judgment at all.</p>
<p>This is why inventory-first feels right but leads to pointless activities. An inventory gives leadership a number they can put on a slide: 500,000 assets to migrate. But that number tells you nothing about how many decisions you actually face. It treats an orphaned table untouched for five years the same as a pipeline feeding a daily business report. Usage data makes them obviously different. The inventory makes them equal.</p>
<p>The default playbook also misidentifies what a legacy environment is. It assumes a system: something designed, something you can pick up and relocate. But a ten-year-old data warehouse is not a system. It is an accumulation: layers of decisions made by people who are no longer around, solving problems that may no longer exist. You do not relocate an accumulation. You find what is still alive inside it and build forward from there. Martin Fowler describes the general pattern as a <a href="https://martinfowler.com/bliki/StranglerFigApplication.html">Strangler Fig</a>: the new system grows around the old one while the old one atrophies. But in a mature legacy environment, most of the atrophy has already happened. The tables are already dead. The pipelines already stopped. You just haven't confirmed it yet.</p>
<p>Understanding the environment does not prepare you for the migration. It is the migration. Once you know what is alive, the rest is cleanup.</p>
<h2>An Alternative: Quarantine Before You Classify</h2>
<p>The alternative requires a different starting point: let usage tell you what matters, instead of asking teams to figure it out from a spreadsheet.</p>
<p>Start with what you can act on immediately. Usage data (who accessed what, and when) separates the living parts of the environment from the dead ones, without requiring anyone to understand what each asset is.</p>
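<p>As a minimal sketch of that separation, assuming you can export a per-asset last-access timestamp from your platform's audit or access logs (the asset names and field names below are invented for illustration):</p>
<pre><code class="language-python">from datetime import datetime, timedelta, timezone

# Hypothetical export from the warehouse's access logs:
# one row per asset, with the timestamp of its last read (None = never read).
assets = [
    {"name": "sales.daily_orders", "last_read": datetime(2026, 4, 1, tzinfo=timezone.utc)},
    {"name": "tmp.scratch_2019_q3", "last_read": None},
]

cutoff = datetime.now(timezone.utc) - timedelta(days=90)

# "Alive" means read within the window; everything else is a quarantine candidate.
alive = [a for a in assets if a["last_read"] and a["last_read"] &gt;= cutoff]
candidates = [a for a in assets if a not in alive]

print(f"{len(alive)} live assets, {len(candidates)} quarantine candidates")
</code></pre>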
<p>Once you have that separation, introduce a quarantine window. If a resource hasn't been accessed for a defined period (say 90 days), block access to it and notify the teams. If nobody requests restoration within another 90 days, back it up and delete it. If someone does need it, you've just identified a genuinely valuable asset: route it to the appropriate team and start a proper migration lifecycle for it.</p>
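<p>The quarantine rule itself is simple enough to express as a small state function. Here is a sketch, with both 90-day windows as assumptions to tune for your environment:</p>
<pre><code class="language-python">from datetime import datetime, timedelta, timezone

QUIET = timedelta(days=90)  # no reads for this long: enter quarantine
GRACE = timedelta(days=90)  # quarantined for this long: back up, then delete

def lifecycle_state(last_read, quarantined_at, now=None):
    """Classify an asset as 'active', 'quarantined', or 'deletable'.
    last_read may be None for assets that were never accessed."""
    now = now or datetime.now(timezone.utc)
    if last_read is not None and now - last_read &lt; QUIET:
        return "active"       # genuinely in use: route to a proper migration
    if quarantined_at is None or now - quarantined_at &lt; GRACE:
        return "quarantined"  # block access, notify teams, await restore requests
    return "deletable"        # nobody asked for it back
</code></pre>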
<p>The important assets identify themselves. Instead of teams sifting through hundreds of thousands of items, the environment surfaces its own priorities through actual usage. The noise falls away on its own.</p>
<p>Postpone permanent deletion as long as practically possible. Data loss is the one mistake you can't reverse. Quarantine simulates deletion before you commit to it, and catches seasonal patterns that a 90-day snapshot would miss.</p>
<p>With this model, you can report meaningful progress monthly: deadwood percentage, reduction in active resources, cleanup projections. Metrics an executive can act on, without requiring teams to manually work through irrelevant assets.</p>
<h2>The Question to Ask About Your Own Environment</h2>
<p>Before committing to the default playbook, ask one question about your legacy environment: what percentage of your assets have been accessed in the last 90 days?</p>
<p>If the answer is low, and it may be far lower than you expect, you are not facing an inventory problem. You are facing an archaeology problem. Archaeology does not start with a catalog. It starts with a question: what here is still alive?</p>
]]></content:encoded></item><item><title><![CDATA[How we use Claude to write code]]></title><description><![CDATA[Claude is a tool. It helps us think faster and write code faster. But the developer is still in charge. The goal is simple: better code, clear thinking, and full responsibility.
This document explains]]></description><link>https://blog.datachef.co/how-we-use-claude-to-write-code</link><guid isPermaLink="true">https://blog.datachef.co/how-we-use-claude-to-write-code</guid><category><![CDATA[claude]]></category><category><![CDATA[AI]]></category><category><![CDATA[#ai-tools]]></category><dc:creator><![CDATA[Farbod Ahmadian]]></dc:creator><pubDate>Thu, 26 Mar 2026 14:11:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/6621274e5f2317db9857f023/2b69739c-f0b7-4a60-ba02-d77aece62f51.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Claude is a tool. It helps us think faster and write code faster. But the developer is still in charge. The goal is simple: better code, clear thinking, and full responsibility.</p>
<p>This document explains how we use Claude at DataChef.</p>
<hr />
<h2>1. Always commit your <code>claude.md</code></h2>
<p>Every project that uses Claude must have a <code>claude.md</code>.</p>
<p>This file explains how Claude should behave in the project.</p>
<p>It should include things like:</p>
<ul>
<li><p>coding style</p>
</li>
<li><p>architecture rules</p>
</li>
<li><p>libraries we prefer</p>
</li>
<li><p>things we never do</p>
</li>
<li><p>how tests should look</p>
</li>
<li><p>how commits should look</p>
</li>
</ul>
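<p>For illustration, a minimal <code>claude.md</code> could look like this (the specific rules are invented examples; yours will reflect your own project):</p>
<pre><code class="language-markdown"># Project conventions for Claude

## Style
- Python 3.12, formatted with ruff, type hints on all public functions

## Architecture
- Business logic lives in core/; core/ never imports from api/

## Never do
- No new dependencies without discussing them in the PR first
- Never commit secrets or .env files

## Tests
- Every new function gets a unit test, including at least one failure case

## Commits
- Conventional Commits (feat:, fix:, refactor:), imperative mood
</code></pre>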
<p>Why this matters:</p>
<p>Claude works best when it has context. Without context it guesses. With context it becomes consistent.</p>
<p>Your <code>claude.md</code> is the memory of the project.</p>
<p>Commit it to the repository so everyone works with the same rules.</p>
<hr />
<h2>2. Put your important prompts in the pull request</h2>
<p>When Claude helps write code, the reviewer should know how that code was created.</p>
<p>If a prompt had a big influence on the result, include it in the pull request description.</p>
<p>This helps reviewers understand:</p>
<ul>
<li><p>the intent of Claude</p>
</li>
<li><p>the reasoning of Claude</p>
</li>
<li><p>what Claude was asked to do</p>
</li>
</ul>
<p>It also makes it easier to reproduce or improve the result later.</p>
<p>Transparency builds trust.</p>
<hr />
<h2>3. Teach Claude not to sound like AI</h2>
<p>Add a new <a href="https://support.claude.com/en/articles/12512176-what-are-skills">skill</a> that teaches Claude not to sound like AI, avoiding these patterns:</p>
<p><a href="https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing">https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing</a></p>
<p>AI text often looks like:</p>
<ul>
<li><p>overly formal language</p>
</li>
<li><p>repetitive structures</p>
</li>
<li><p>too many bullet points</p>
</li>
<li><p>vague explanations</p>
</li>
<li><p>generic transitions</p>
</li>
</ul>
<p>We want writing that sounds human and direct.</p>
<p>Short sentences. Clear thinking. No filler.</p>
<hr />
<h2>4. Never use <code>--dangerously-skip-permissions</code> in production</h2>
<p>Do not start Claude with <code>--dangerously-skip-permissions</code>.</p>
<p>Always review what Claude wants to do.</p>
<p>Read each prompt and tool request before approving it.</p>
<p>You should always know:</p>
<ul>
<li><p>what files Claude reads</p>
</li>
<li><p>what files Claude changes</p>
</li>
<li><p>what commands it runs</p>
</li>
</ul>
<hr />
<h2>5. Claude writes code. You own the code.</h2>
<p>Claude can generate code, but you are responsible for it.</p>
<p>Always:</p>
<ul>
<li><p>read the code</p>
</li>
<li><p>understand the code</p>
</li>
<li><p>question the code</p>
</li>
</ul>
<p>If you cannot explain a change to another developer, do not merge it.</p>
<hr />
<h2>6. Ask for small steps</h2>
<p>Do not ask Claude to build a whole system in one prompt.</p>
<p>Work in small steps.</p>
<p>Example flow:</p>
<ol>
<li><p>ask Claude to design the approach</p>
</li>
<li><p>review the plan</p>
</li>
<li><p>implement one part</p>
</li>
<li><p>review again</p>
</li>
<li><p>continue</p>
</li>
</ol>
<p>Small steps reduce mistakes.</p>
<hr />
<h2>7. Prefer editing over generating</h2>
<p>If a file already exists, ask Claude to improve or refactor it.</p>
<p>Do not ask it to rewrite everything.</p>
<p>Large rewrites often introduce hidden problems.</p>
<p>Good prompts look like:</p>
<ul>
<li><p>"simplify this function"</p>
</li>
<li><p>"remove duplication"</p>
</li>
<li><p>"add tests for this logic"</p>
</li>
<li><p>"explain the edge cases"</p>
</li>
</ul>
<hr />
<h2>8. Always ask for tests</h2>
<p>If Claude writes logic, it should also suggest tests.</p>
<p>Tests help verify that the code does what we expect.</p>
<p>Good prompts include:</p>
<ul>
<li><p>"write unit tests for this function"</p>
</li>
<li><p>"add edge cases"</p>
</li>
<li><p>"show failure scenarios"</p>
</li>
</ul>
<p>You can add this as part of your <code>claude.md</code> to make sure you never forget.</p>
<hr />
<h2>9. Ask Claude to explain its reasoning</h2>
<p>Before accepting a change, ask Claude questions.</p>
<p>Examples:</p>
<ul>
<li><p>why is this approach better</p>
</li>
<li><p>what edge cases exist</p>
</li>
<li><p>what could break</p>
</li>
<li><p>what are the performance risks</p>
</li>
</ul>
<p>Claude is good at surfacing hidden issues when asked directly.</p>
<hr />
<h2>10. Keep prompts simple and direct</h2>
<p>Claude works best with clear instructions.</p>
<p>Bad prompt:</p>
<p>"Can you improve this in a robust scalable architecture that follows best practices?"</p>
<p>Better prompt:</p>
<p>"Reduce complexity in this function. Do not change the behavior."</p>
<p>Clarity produces better results.</p>
<hr />
<h2>11. Use Claude as a thinking partner</h2>
<p>Claude is not just for writing code.</p>
<p>It is useful for:</p>
<ul>
<li><p>debugging</p>
</li>
<li><p>reading unfamiliar code</p>
</li>
<li><p>designing APIs</p>
</li>
<li><p>writing migrations</p>
</li>
<li><p>reviewing pull requests</p>
</li>
<li><p>explaining errors</p>
</li>
</ul>
<p>Treat it like a second developer who thinks fast.</p>
<p>But remember: you are the final reviewer.</p>
<hr />
<h2>12. Leave the codebase better</h2>
<p>Every Claude-assisted change should improve the codebase.</p>
<p>Examples:</p>
<ul>
<li><p>clearer naming</p>
</li>
<li><p>better structure</p>
</li>
<li><p>fewer lines</p>
</li>
<li><p>stronger tests</p>
</li>
<li><p>simpler logic</p>
</li>
</ul>
<p>Speed is helpful. Quality is the goal.</p>
<hr />
<h2>Final rule</h2>
<p>Claude is powerful.</p>
<p>But good engineering still comes from:</p>
<ul>
<li><p>careful thinking</p>
</li>
<li><p>good reviews</p>
</li>
<li><p>clear communication</p>
</li>
<li><p>responsibility for the code</p>
</li>
</ul>
<p>Use Claude to move faster.</p>
<p>Do not use it to stop thinking.</p>
]]></content:encoded></item><item><title><![CDATA[Introducing DANA: Conversational AI for Legacy System Modernization]]></title><description><![CDATA[Imagine your data expert could answer every team's questions at once, without a single meeting. That's DANA. Built by DataChef, DANA is a conversational AI that learns from your domain experts, unders]]></description><link>https://blog.datachef.co/introducing-dana-conversational-ai-for-legacy-system-modernization</link><guid isPermaLink="true">https://blog.datachef.co/introducing-dana-conversational-ai-for-legacy-system-modernization</guid><dc:creator><![CDATA[Alireza Ebrahimkhani]]></dc:creator><pubDate>Mon, 09 Mar 2026 13:08:17 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/689894d630ab7b2b509d0ee7/02218bf5-2cd6-44a0-a8c5-ed3ee4dfe8c8.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine your data expert could answer every team's questions at once, without a single meeting. That's DANA. Built by DataChef, DANA is a conversational AI that learns from your domain experts, understands your schemas, and gives your teams instant, reliable answers about your data: how it maps, what it means, and how to use it.</p>
<p>No more bottlenecks. No more waiting. Just ask DANA (Data and AI knowledge Assistant).</p>
<h2>The Problem: One Expert, Ten Teams, a Hundred Questions</h2>
<p>Large organizations undergoing system migrations or platform integrations know this scenario well. A domain expert owns the knowledge about how the data model works. Business analysts across multiple downstream teams need answers to move forward:</p>
<ul>
<li><p><em>"I'm consuming a field from the old system. What's the equivalent in the new data model?"</em></p>
</li>
<li><p><em>"Can this field be empty, or is it always required?"</em></p>
</li>
<li><p><em>"What does this status code mean and how should I use it?"</em></p>
</li>
<li><p><em>"What are the right fields to get the official name at different hierarchy levels?"</em></p>
</li>
</ul>
<p>These are the kinds of questions that come up dozens of times a day during a migration. The answers exist, but they're scattered across Confluence pages, Excel mappings, design documents, and most critically, in the domain expert’s head.</p>
<p>The architect spends their days answering repetitive questions instead of doing architectural work. Analysts wait hours or days for responses. Migration timelines slip. And when the architect is unavailable, progress stops entirely.</p>
<blockquote>
<p>💡 This is not a tooling problem. It's a knowledge bottleneck. And you can't solve it by buying another documentation tool that nobody will maintain.</p>
</blockquote>
<h2>Why Existing Tools Don't Solve This</h2>
<p>You might think: "We already have a data catalog" or "We documented everything in Confluence." The reality is that most organizations have tried at least one of these approaches, and they all fall short in the same way.</p>
<p><strong>Data catalogs</strong> are great for storing metadata, and many support field descriptions, glossary links, and even column-level lineage. But that information still needs to be written and maintained by someone. And when it comes to cross-system mappings during a migration, how a field in the old model translates to the new one, under what conditions, and with what edge cases, catalogs typically don't capture that level of contextual knowledge.</p>
<p><strong>Documentation wikis</strong> (Confluence, SharePoint, Notion) become outdated the moment they're written. Maintaining them requires manual effort that nobody has time for. After a few months, teams stop trusting them and go back to asking the domain expert directly.</p>
<p><strong>Schema extraction tools</strong> can pull table structures and column types from a database. But <code>CUST_FLG_02</code> is still just <code>CUST_FLG_02</code>. Without a business context, raw schema data is only marginally useful.</p>
<p><strong>Documentation sprints</strong> work in theory: you get everyone in a room, write everything down, and publish it. In practice, the knowledge is stale before the sprint is over, and you've burned weeks of your domain expert's time in the process.</p>
<p>The common thread? All of these approaches treat knowledge capture as a one-time event. But domain knowledge evolves constantly, especially during migrations. You need something that learns continuously and stays up to date.</p>
<h2>What is DANA?</h2>
<p>DANA is a conversational AI that sits between your domain experts and your teams. Instead of trying to replace the architect, DANA <strong>learns from them</strong> through natural conversation, and then makes that knowledge available to everyone on demand.</p>
<p>Think of it as a colleague who has perfect memory. The domain expert explains something once, and DANA remembers it, structures it, and can repeat it accurately to anyone who asks.</p>
<p>But unlike a wiki page, DANA doesn't just sit there waiting to go stale. It actively keeps itself up to date. When team members ask questions that reveal gaps, DANA flags those gaps and prompts the domain expert to fill them. When new schemas are detected or data models change, DANA identifies what's outdated and initiates a review. The architect doesn't need to remember to update DANA; DANA tells the domain experts what needs attention.</p>
<p>DANA combines three things that are usually separate: <strong>schema extraction</strong> (understanding the technical structure of your data), <strong>knowledge enrichment</strong> (adding business context and meaning), and a <strong>conversational AI interface</strong> (so humans can teach it and query it naturally). Together, these create a knowledge base that isn't just a static document, but a living system that grows smarter with every interaction.</p>
<h2>How DANA Works</h2>
<h3>1. The Architect Teaches, DANA Remembers</h3>
<p>A data expert talks to DANA the same way they'd talk to a colleague. They explain how the data model works, clarify field meanings, describe mappings between old and new systems, and flag edge cases. DANA captures these explanations, structures them, and stores them as validated knowledge.</p>
<p>No documentation writing. No spreadsheet filling. Just conversation.</p>
<p>Here's what a session might look like:</p>
<blockquote>
<p><strong>Architect:</strong> "The product code field in the legacy system maps to a different field name in the new domain model. It's always populated for active records, but can be null for discontinued items."</p>
<p><strong>DANA:</strong> "Got it. I've recorded the mapping between those two fields. Populated for active records, nullable for discontinued. Is there a specific status field I should check to determine if a record is discontinued?"</p>
<p><strong>Architect:</strong> "Yes, there's a lifecycle status field. If it's set to 'DISCONTINUED', the product code may be null."</p>
<p><strong>DANA:</strong> "Understood. I've linked this to the lifecycle status field as a dependency. <strong>Changed:</strong> product code mapping documented with nullability rule. <strong>Gaps:</strong> I still don't have mappings for 3 other attributes in the pricing group. <strong>Next:</strong> Want to work on those, or switch to a different area?"</p>
</blockquote>
<p>DANA doesn't just record what you say. It asks follow-up questions, connects related facts, tracks what's still missing, and tells you what to work on next.</p>
<blockquote>
<p>Rather than uploading large volumes of raw internal data, the architect teaches DANA the concepts and mappings it needs. This keeps sensitive client data secure while still building a rich, contextual knowledge base.</p>
</blockquote>
<h3>2. Schema Understanding</h3>
<p>DANA also ingests schemas directly from your databases and warehouses. It connects to systems like Oracle, SQL Server, Teradata, and PostgreSQL, and extracts the technical picture: tables, columns, data types, primary and foreign keys, stored procedures, and views.</p>
<p>But DANA goes a step further than typical schema extraction. It also analyzes <strong>query execution history</strong> to understand which parts of the data are actually in use. Which tables are queried daily? Which ones haven't been touched in years? Which columns show up in the most joins?</p>
<p>This usage profiling helps teams prioritize. Instead of trying to document every table in a legacy system with thousands of objects, you start with the ones that actually matter to the business.</p>
<h3>3. The Team Asks, DANA Answers</h3>
<p>Once DANA has learned from the architect, anyone on the team can ask questions directly through the conversational AI interface. And they get real answers, not just links to documents.</p>
<p>Here's what that looks like in practice:</p>
<blockquote>
<p><strong>Data Engineer:</strong> "I'm consuming the product code from the legacy system. What is the equivalent attribute in the new domain model?"</p>
<p><strong>DANA:</strong> "The equivalent field in the new model is the product identifier. Note: this field can be null for discontinued records (based on the lifecycle status). For active records, it's always populated.</p>
<p><em>Source:</em></p>
<ul>
<li><p><em>domain architect name</em></p>
</li>
<li><p>relevant document links"</p>
</li>
</ul>
</blockquote>
<p>DANA provides answers with references to where the knowledge came from.</p>
<p>Domain experts teach DANA what they know. Downstream teams consume that knowledge on demand. DANA tracks what it has learned, what is still missing, and what needs validation, creating a living knowledge base that improves with every interaction.</p>
<h3>4. DANA Evolves With Your Data</h3>
<p>DANA's knowledge isn't a snapshot; it evolves as your data model evolves. Every question from the team that can't be fully answered becomes a signal: a gap that DANA tracks and raises with the architect in the next session. When schemas change, or new source systems are added, DANA detects what's new and flags what needs to be reviewed or remapped. This creates a continuous feedback loop: the team's questions drive the architect's priorities, and the architect's answers expand what the team can self-serve. Over time, DANA covers more ground, answers more questions, and requires less input from the architect.</p>
<hr />
<h2>What Your Teams Get</h2>
<ul>
<li><p><strong>Instant answers</strong>: No more waiting for the architect's calendar to open up. Teams get reliable, referenced answers through the conversational AI interface whenever they need them.</p>
</li>
<li><p><strong>Consistent knowledge</strong>: Everyone gets the same validated information. When the architect clarifies a mapping with DANA, that clarification is immediately available to all teams.</p>
</li>
<li><p><strong>Reduced key-person risk</strong>: Domain knowledge is no longer locked in one person's head. It's captured, structured, and accessible, even when key people move on.</p>
</li>
<li><p><strong>Faster migrations</strong>: Impact analysis that used to take weeks of back-and-forth can happen in days when teams can self-serve.</p>
</li>
<li><p><strong>Data contracts</strong>: DANA generates YAML-based data contracts following the <a href="http://datacontracts.com">datacontracts.com</a> specification. A data contract defines the structure, meaning, ownership, and rules of a dataset in a machine-readable format. Think of it as an API contract, but for data. These contracts can be versioned in Git, published to your data catalog, and used to enforce data quality standards across teams. A sketch of what such a contract can look like follows this list.</p>
</li>
</ul>
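<p>To give a flavor of the output, here is a minimal sketch in the spirit of that specification. The id, field names, and enum values are invented, and the exact schema should be checked against the specification itself:</p>
<pre><code class="language-yaml">dataContractSpecification: 1.1.0
id: urn:datacontract:checkout:orders   # invented example id
info:
  title: Orders
  version: 1.0.0
  owner: checkout-team                 # who to contact about this dataset
models:
  orders:
    type: table
    fields:
      order_id:
        type: string
        required: true
        description: Stable unique identifier, consistent across systems.
      lifecycle_status:
        type: string
        enum: [ACTIVE, DISCONTINUED]   # product identifier may be null when DISCONTINUED
</code></pre>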
<hr />
<h2>Who is DANA For?</h2>
<p>DANA is built for organizations where data knowledge is the bottleneck:</p>
<table>
<thead>
<tr>
<th>Scenario</th>
<th>How DANA Helps</th>
</tr>
</thead>
<tbody><tr>
<td><strong>System Migrations</strong></td>
<td>Teams understand field mappings and model differences before they start building, not after. DANA captures the "why" behind the mapping, not just the "what."</td>
</tr>
<tr>
<td><strong>Platform Onboarding</strong></td>
<td>Data contracts and catalog entries are generated from validated knowledge, not guesswork. New platforms get clean, documented data from day one.</td>
</tr>
<tr>
<td><strong>Data Governance</strong></td>
<td>Schema documentation is explainable, reviewable, and traceable to its source. Auditors can see where every description came from.</td>
</tr>
<tr>
<td><strong>Knowledge Preservation</strong></td>
<td>Institutional knowledge is captured through conversational AI before it walks out the door. When key people move on, the knowledge stays.</td>
</tr>
<tr>
<td><strong>Scaling Data Teams</strong></td>
<td>New team members get up to speed by asking DANA instead of requiring hours of the architect's time. Onboarding becomes self-service.</td>
</tr>
</tbody></table>
<hr />
<h2>Conclusion</h2>
<p>Your domain experts shouldn't be answering the same questions over and over. And your teams shouldn't be waiting in line for answers that already exist somewhere.</p>
<p>DANA turns conversations into knowledge, and knowledge into self-service. It learns continuously, stays consistent, and scales to every team that needs it.</p>
<p>That's conversational AI applied to data. And it changes how organizations approach legacy modernization.</p>
<p>Interested in DANA or facing a similar challenge? Reach out to us at <a href="https://datachef.co/contact">datachef.co/contact</a> or connect with us on <a href="https://www.linkedin.com/company/datachefco">LinkedIn</a>.</p>
<p>#conversational-ai #data-engineering #data-contracts #legacy-modernization</p>
]]></content:encoded></item><item><title><![CDATA[The Missing Right Side of Your dbt DAG]]></title><description><![CDATA[If you maintain a dbt project long enough, you end up with a familiar problem: you can see what’s upstream of a model, but it is surprisingly hard to answer what is downstream in the real world. Which]]></description><link>https://blog.datachef.co/the-missing-right-side-of-your-dbt-dag</link><guid isPermaLink="true">https://blog.datachef.co/the-missing-right-side-of-your-dbt-dag</guid><category><![CDATA[data-engineering]]></category><category><![CDATA[documentation]]></category><category><![CDATA[dbt]]></category><dc:creator><![CDATA[Boris Morel]]></dc:creator><pubDate>Mon, 09 Mar 2026 10:18:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/68baa704fe509259a5bd6efa/0142cc58-7dde-47f2-b0be-f7b203c2140b.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you maintain a dbt project long enough, you end up with a familiar problem: you can see what’s upstream of a model, but it is surprisingly hard to answer what is <em>downstream</em> in the real world. Which dashboards, reports, ML jobs, or extracts depend on this thing, and who owns them? That uncertainty makes refactors risky, and it makes the impact of data incidents much harder to assess quickly.</p>
<p>dbt <strong>exposures</strong> are a lightweight way to document those consumers <em>as code</em>, right next to the models and sources they depend on. They turn “tribal knowledge” into something reviewable, queryable, and much harder to accidentally forget.</p>
<h1><strong>Why “docs outside the repo” drift</strong></h1>
<p>Those who have worked with me might know that I take a very careful approach when it comes to documentation, especially when it is decoupled from the codebase. Reasons include:</p>
<ul>
<li><p>When someone is reading a piece of code, they might not think of looking elsewhere (Confluence, SharePoint) for extra insight.</p>
</li>
<li><p>You can be certain that shortly after the documentation is published, it will become outdated, because code contributors forget or do not prioritize documentation updates.</p>
</li>
</ul>
<p>On the other hand, context about logic someone went through the trouble to implement provides enormous value. In the case of dbt models: why do we have a model? Who or what consumes it? What are the consequences if it breaks, or if we remove or alter it? Who should we reach out to before we clean up a legacy model?</p>
<p>An intermediate model (i.e. one that is upstream of another model) has an obvious purpose. Its usage is self-documented: just look at its child models. But what about all the models <em>dangling</em> at the far right end of the DAG in your dbt project?</p>
<p>Most users are aware that models and data sources can be enriched with documentation and descriptions. But dbt also offers a powerful way to document the <em>usage</em> of models as code. This feature may not be well known, so in this post I will explain it and (hopefully) convince more dbt practitioners to adopt it.</p>
<h1><strong>Defining exposures in YAML</strong></h1>
<p>In your dbt project, you can add one or more YAML files that declare <strong>exposures</strong>. Let’s first look at how to do it (slightly modified example from the <a href="https://docs.getdbt.com/docs/build/exposures">official documentation</a>):</p>
<pre><code class="language-yaml">exposures:

  - name: weekly_jaffle_metrics
    label: Jaffles by the Week
    type: dashboard
    maturity: high
    url: https://bi.tool/dashboards/1
    description: &gt;
      Did someone say "exponential growth"?

    depends_on:
      - ref('fct_orders')
      - ref('dim_customers')

    owner:
      name: Callum McData
      email: data@jaffleshop.com
</code></pre>
<p>Once you add exposures and generate docs (<code>dbt docs generate</code>), they appear as first-class entities in the dbt documentation website: you can browse them, see their descriptions, owners, and links (for example, to the dashboard), and navigate their lineage to the upstream models and sources they depend on. In other words, they show up alongside models and sources in the docs UI, instead of living in a separate wiki.</p>
<h1><strong>Keeping documentation from drifting</strong></h1>
<p>An advantage of dbt exposures over, for example, a Confluence page is that they live close to the code. If a developer comes across a model, they can find the exposure easily because it sits in the same repository as the models. Thanks to this proximity alone, it is less likely to go out of sync. But another characteristic of exposures makes them even less likely to drift away from reality:</p>
<p>If a model referenced by an exposure (in this case <code>fct_orders</code> or <code>dim_customers</code>) is deleted or renamed, the dbt project won't be able to compile:</p>
<pre><code class="language-shell">% dbt test

Encountered an error:

Compilation Error

Exposure 'exposure.weekly_jaffle_metrics' (models/exposures.yml) depends on a node named 'fct_orders' which was not found
</code></pre>
<p>This way, the contributor will be alerted that this model is being used and that care should be taken before going ahead with the change. This is because exposures are nodes in the graph of the dbt project, just like models and sources. If you reference a missing node, the project can’t compile.</p>
<p>So far, this probably sounds like exposures “solve” the problem. They help a lot, but they are not a silver bullet.</p>
<h1><strong>Limitations (and what exposures are not)</strong></h1>
<p>Exposures are useful, but it helps to be explicit about their limits:</p>
<ul>
<li><p>They do not catch all breaking changes. A column rename or semantic logic change can still break a dashboard even if the exposure compiles.</p>
</li>
<li><p>They document <em>known</em> consumers. The absence of an exposure does <strong>not</strong> prove that a model is unused. It may simply mean the consumer has not been documented yet.</p>
</li>
</ul>
<p>In other words, the developer’s vigilance is still required. In the next sections, I will show how exposures <em>augment</em> that vigilance by letting you query the graph for impact analysis (starting from an upstream model or source) and for investigations that start from a consumer (like a dashboard).</p>
<h1><strong>Impact analysis: find downstream consumers</strong></h1>
<p>If a source or model with many nodes downstream has an issue, or needs to be modified, we can query the dbt project’s graph to find all exposures downstream of that node, and therefore potentially affected. This gives you the impact of a change, plus a list of consumers and contact people to reach out to.</p>
<p>Let’s say the <code>raw_payments</code> source table (here a seed) has an issue. A <code>raw_payments+</code> selection shows the whole lineage down to the affected exposures:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68baa704fe509259a5bd6efa/636e0750-6eae-44b0-9a60-1a59a32b4de7.png" alt="" style="display:block;margin:0 auto" />

<p>If you prefer the CLI, this is a command that lists all exposures downstream of the <code>raw_payments</code> node:</p>
<pre><code class="language-shell">% dbt ls --select raw_payments+ --resource-type exposure
exposure:jaffle_shop.customer_health_dashboard
exposure:jaffle_shop.customer_segmentation_ml
exposure:jaffle_shop.weekly_revenue_report
</code></pre>
<h1><strong>When the alert comes from a dashboard</strong></h1>
<p>Not every incident starts at the source. Sometimes a business user reports that a KPI in a dashboard looks off, or that a scheduled report has stopped refreshing. In that moment, the first question is usually: what are all the upstream models and sources that could explain this symptom?</p>
<p>If you declare dashboards (and other downstream assets) as exposures, they become a natural entry point for that investigation. Because each exposure lists the dbt nodes it depends on, you can quickly get an initial shortlist of models to inspect first.</p>
<p>For example, to find the nodes declared as dependencies of a given exposure (the same lineage can also be explored in the dbt docs UI):</p>
<pre><code class="language-shell"># List the dependencies of a specific exposure

% dbt ls --select +exposure:jaffle_shop.customer_health_dashboard
</code></pre>
<p>From there, you can keep traversing upstream, or combine it with other selectors to narrow your search. The key point is that you can start from “the thing that looks wrong” and move left through the graph, instead of guessing where to begin.</p>
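<p>For instance, combining the exposure selector with a resource-type filter narrows the shortlist to just the raw inputs:</p>
<pre><code class="language-shell"># Only the sources feeding this dashboard, skipping intermediate models

% dbt ls --select +exposure:jaffle_shop.customer_health_dashboard --resource-type source
</code></pre>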
<h1><strong>Review workflow: keep consumers involved</strong></h1>
<p>When introducing or modifying an exposure file, it's crucial to open the pull request like any other code change and to request an owner of the consumer described in the exposure file as a reviewer. That way, that person can verify that we understood how the model is used and that the contact information is correct.</p>
<p>This review step is also the best moment to make the <code>description</code> field truly useful. The consumer can help make it complete by adding the business meaning (what question this dashboard answers), important caveats (filtering rules, known limitations, edge cases), and freshness expectations (how often it should refresh, what “stale” means, and what to do when it is late). Over time, this turns exposures into a lightweight shared contract between producers and consumers.</p>
<p>This can be enforced with GitHub if the exposures for a group of consumers (a team) are defined together in one file: set that team as a code owner on the file, and their approval will be required whenever it is modified.</p>
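<p>A single line in the repository's <code>CODEOWNERS</code> file is enough (the path and team name here are invented):</p>
<pre><code># .github/CODEOWNERS

models/exposures/marketing.yml @acme/marketing-analytics
</code></pre>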
<h1><strong>Wrap-up</strong></h1>
<p>I hope this post has convinced you that exposures can benefit your dbt setup. They help you document model usage, catch some mistakes early, and keep a complete view of impacted consumers in case of data issues or code changes, all while staying up to date.</p>
<p>If you want a low-effort way to get started, pick one critical dashboard (the one people will notice within minutes if it breaks) and add a single exposure for it.</p>
<ul>
<li><p>Link to the dashboard.</p>
</li>
<li><p>Add a real owner (a person or a team).</p>
</li>
<li><p>Write a <code>description</code> that captures the business meaning, caveats, and what “fresh” means.</p>
</li>
</ul>
<p>Once that is in place, you have a reliable starting point for impact analysis, and for investigations that begin from a consumer (“this dashboard looks wrong”).</p>
]]></content:encoded></item><item><title><![CDATA[Don't Split Product Strategy from Execution]]></title><description><![CDATA[Mention the Product Owner role to someone from a startup or scale-up background, and you'll likely get a blank stare. They'll probably google it later to figure out what you meant.
On the contrary, pe]]></description><link>https://blog.datachef.co/product-owner-product-manager-strategy-execution</link><guid isPermaLink="true">https://blog.datachef.co/product-owner-product-manager-strategy-execution</guid><category><![CDATA[Product Management]]></category><category><![CDATA[productowner]]></category><category><![CDATA[product strategy]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Fri, 20 Feb 2026 15:13:19 GMT</pubDate><enclosure url="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/673ca28eaf27dbd59d38eb71/a2a82a69-ec01-4f12-9648-9452f7fd34ea.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Mention the Product Owner role to someone from a startup or scale-up background, and you'll likely get a blank stare. They'll probably google it later to figure out what you meant.</p>
<p>By contrast, people who've spent their entire career in large enterprises treat it as second nature. “You can't have a product team without a Product Owner!”, I've heard several times. They might have more doubts about the Product Manager role, though, assuming they've ever worked with one.</p>
<p>So, is this just another article debating the differences between the two roles and whether you need one or both? Not quite. For me, <strong>the Product Manager vs Product Owner debate is asking the wrong question.</strong></p>
<p>Yes, there's plenty of literature on the definition for each role and the right terminology to use. But here's the thing: in any organization that draws a hard line between these roles and wastes time debating where to draw it, <strong>the distinction itself is a symptom of organizational dysfunction, not a solution.</strong></p>
<p>The real question isn't "what's the difference between these roles?" It's "why do so many enterprises struggle to build truly product-driven organizations?"</p>
<h2>How we got here</h2>
<p>Let's start with the facts. Product Owner is a role defined by Scrum, a specific methodology with a specific scope. In the Scrum framework, the Product Owner manages the backlog, prioritizes work, plans sprints, and communicates delivery status to stakeholders. Within its intended scope (execution within a squad) this is a perfectly valid role.</p>
<p>The Product Owner role proliferated alongside the adoption of scaled agile frameworks like SAFe, particularly in large enterprises trying to bring structure to dozens or hundreds of product teams. It became a way to ensure every squad had someone accountable for "what gets built next."</p>
<p>The role serves a purpose for companies that embrace Scrum, but it has significant limitations. With such a strong emphasis on operational excellence and delivery, Product Owners tend to focus on <strong>outputs rather than outcomes</strong>. They become skilled at managing sprints, refining user stories, and keeping stakeholders informed, but they rarely step back to ask whether the work delivers real value to customers or moves key business metrics. That strategic component is often neglected or taken for granted.</p>
<h2>When things go south: the anti-patterns</h2>
<p>Many enterprises tried to address the Product Owner role's limitations by introducing a top-down approach to "inject" that strategic component into teams. Since this is a gray area in Scrum methodologies, I've seen at least two major anti-patterns, each dysfunctional in its own way.</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/673ca28eaf27dbd59d38eb71/7b3d1e1e-2b65-4a80-b9b7-080fd9b05502.jpg" alt="The strategist sits on an ivory tower, while people on the ground look at a strategy plan confused." style="display:block;margin:0 auto" />

<h3>Anti-pattern #1: The Role Proliferation</h3>
<p>Some enterprises create a two-tier system: Product Managers who set strategy, and Product Owners who execute it. On paper, this sounds like a clean division of labor. In practice, it's often a disaster.</p>
<p><strong>The strategist loses touch with reality.</strong> When you're not involved in day-to-day prioritization and delivery, you lose the essential feedback loop that tells you whether your strategy actually works. You miss the technical constraints, the user journey friction, the implementation edge cases that should inform strategic decisions. For more complex products, you might even lose touch with the product itself. You are too distant from its users to develop empathy with them.</p>
<p><strong>The executor loses agency.</strong> When you're handed a strategy from above and told to "just execute it," you become an order-taker. You can't make intelligent trade-offs because you don't fully understand the commercial reasoning behind the work. You're managing a backlog, not owning outcomes.</p>
<h3>Anti-pattern #2: The Project Management Masquerade</h3>
<p>Other companies go a different route: they only have Product Owners, and these Product Owners report to technical leadership, such as engineering directors, rather than product leaders.</p>
<p>This usually happens in IT teams within organizations undergoing a "transformation" program of some kind. In these teams, there is typically no <a href="https://blog.datachef.co/event-storming-context-mapping-data-ai-product-managers">product discovery practice</a>, and work originates from strategic initiatives handed down from above. In reality, these are <strong>glorified project management organizations</strong> where the Product Owner's job is simply to keep projects on track.</p>
<p>The Product Owner is never in charge of setting direction. They don't measure value, they measure compliance to a plan laid out by managers or, in the worst cases, by "the business."</p>
<p>And here's the telltale sign of this dysfunction: <strong>when someone in your IT organization refers to another part of the company as "the business," you have a problem.</strong> Because the Product Owner is supposed to <em>be</em> the business. They should understand the business's interests, contribute to them, and be evaluated on business outcomes, not (just) on timely delivery.</p>
<p>In this model, the Product Owner becomes a translator between stakeholders and engineers, a schedule tracker, a meeting organizer. They're valuable, but they're not product managers. And the organization doesn't have anyone truly owning the "why."</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/673ca28eaf27dbd59d38eb71/6282406b-9e59-40b0-9d7b-51061eb2c049.jpg" alt="Stakeholders give a list of projects to the Product Owner, who is coordinating the execution and monitoring the timelines." style="display:block;margin:0 auto" />

<h2>The model that actually works</h2>
<p>Both anti-patterns stem from the same flawed assumption: <strong>splitting product strategy from execution</strong> and delegating strategy to a Product Manager, the management team, or another part of the organization.</p>
<p><strong>Strategy and execution are not separate jobs. They are two sides of the same coin.</strong> The person making strategic bets needs to be close enough to implementation to course-correct. The person prioritizing daily work needs enough strategic context to make smart trade-offs without having to escalate every decision.</p>
<p>The split is artificial and creates more problems than it solves. It's far more natural for every product team to have one person who owns the "why" while staying accountable for what gets built. That role is usually called Product Manager, but the name doesn't matter—what matters is clarity about their skill set and responsibilities. Here's what they should bring to the table:</p>
<p><strong>Strong commercial instinct.</strong> They're obsessed with customers and users, but also with business outcomes. They constantly ask: Is this work helping us capture new customers? Retain existing ones? Cut our operational costs?</p>
<p><strong>Ownership of the "why."</strong> They own the problem space. What problem are we solving next? Why is this the most important thing? How will this impact the customer journey? And how will that translate to business outcomes? They can <a href="https://blog.datachef.co/product-management-engineering-yes">engineer a 'Yes'</a> when a critical opportunity surfaces mid-sprint, finding creative ways to pivot without creating chaos.</p>
<p><strong>Direct connection to execution.</strong> They work closely with engineers, understand technical constraints, and get their hands dirty by looking at data and analytics themselves. Not to micromanage, but because you can't make good strategic decisions without this ground truth.</p>
<p>All of the above is non-negotiable. Whether you're building data products, AI products, or consumer applications, this function must exist.</p>
<p>But it’s just as important to set boundaries and define what this person should <em>not</em> be burdened with:</p>
<ul>
<li><p>Writing detailed user stories (let tech leads translate PRDs into engineering work instead).</p>
</li>
<li><p>Managing team logistics and sprint ceremonies.</p>
</li>
<li><p>Tracking vanity metrics like velocity and story points.</p>
</li>
<li><p>Managing people, hiring, writing performance reviews, etc.</p>
</li>
</ul>
<p>These responsibilities dilute focus from what matters: understanding customer problems and driving business impact.</p>
<h2>Outcomes, not backlogs</h2>
<p><strong>Stop debating whether to call people Product Managers or Product Owners. Start asking whether they own outcomes or just backlogs.</strong></p>
<p>If someone on your team is responsible for a product but spends most of their time managing sprint logistics and tracking velocity, you have a process coordinator. Regardless of title.</p>
<p>If someone is setting product strategy but hasn't talked to a customer or used their own product in weeks, you have a strategist in an ivory tower. Regardless of title.</p>
<p>If someone takes orders from "the business" and measures success by on-time delivery and budget rather than impact, you have a project manager. Regardless of title.</p>
<p>What every product team needs is someone who owns the complete loop: from problem identification to solution delivery to impact measurement. Someone who thinks commercially, understands user journeys, and has the courage to say yes to the right opportunities even when it disrupts the plan.</p>
<p>Call them Product Manager. Call them Product Owner. Call them Chief Problem Solver.</p>
<p><strong>Just don't split the job in half and expect either piece to succeed.</strong></p>
]]></content:encoded></item><item><title><![CDATA[Spec-Driven Development: Ship Features with AI Guardrails]]></title><description><![CDATA[Something has fundamentally shifted in how we build software.
A year ago, if you told us we'd ship a production web application, 8 major features, 11 releases, 17,000 lines of TypeScript across a full]]></description><link>https://blog.datachef.co/spec-driven-development-ship-features-with-ai-guardrails</link><guid isPermaLink="true">https://blog.datachef.co/spec-driven-development-ship-features-with-ai-guardrails</guid><category><![CDATA[Spec-Driven-Development]]></category><category><![CDATA[AI coding]]></category><category><![CDATA[Developer Tools]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Mohsen Hasani]]></dc:creator><pubDate>Thu, 12 Feb 2026 11:51:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1771244526916/ed49fcae-2810-42f2-864e-e0420a452c32.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Something has fundamentally shifted in how we build software.</strong></p>
<p>A year ago, if you told us we'd ship a production web application (8 major features, 11 releases, 17,000 lines of TypeScript across a full-stack React + Express codebase) with AI writing virtually all the code, we'd have smiled politely and moved on. But that's exactly what happened. And the interesting part isn't the AI. It's what we put around the AI.</p>
<h2>The old way is gone (and that's okay)</h2>
<p>Let's be honest: writing every line of code by hand is no longer the most productive way to build software. AI coding assistants have gotten remarkably good. They can scaffold components, implement API endpoints, refactor patterns across files, and even reason about architecture. We've reached a point where the bottleneck isn't typing code, it's knowing what to type.</p>
<p>And that's where most AI-assisted development goes sideways.</p>
<p>Hand an LLM a vague prompt like "build me a dashboard," and you'll get something. It might even look impressive for five minutes. But it won't match your design system. It won't follow your project's conventions. It won't handle the edge cases your users will inevitably find. And good luck maintaining it three months later when nobody (including the AI) remembers why certain decisions were made.</p>
<p>We needed a better approach. We needed to trust the AI, but with guardrails.</p>
<h2>Enter spec-driven development</h2>
<p>The idea is simple: don't start with code. Start with a specification.</p>
<p>Before a single line of code gets written, every feature goes through a structured pipeline:</p>
<ol>
<li><p><strong>Specify:</strong> Describe what you want in plain language. The system generates a formal specification with user stories, functional requirements, success criteria, and acceptance scenarios.</p>
</li>
<li><p><strong>Clarify:</strong> The spec gets challenged. Are the requirements testable? Are there ambiguities? Edge cases? Up to 5 targeted questions are asked and the answers get encoded back into the spec.</p>
</li>
<li><p><strong>Plan:</strong> Technical architecture gets designed. Research is conducted on the best patterns for each problem. Data models, API contracts, and integration scenarios are documented. Every decision gets a rationale and alternatives considered.</p>
</li>
<li><p><strong>Tasks:</strong> The plan gets broken down into a precise, dependency-ordered task list. Each task has an ID, exact file paths, parallel execution markers, and belongs to a specific user story. Sequential vs. parallel execution is explicitly defined.</p>
</li>
<li><p><strong>Implement:</strong> Tasks are executed phase by phase, respecting dependencies. Parallel tasks run simultaneously. Each completed task gets checked off. TypeScript compilation and tests are verified at the end.</p>
</li>
</ol>
<p>This is what <a href="https://github.com/github/spec-kit">spec-kit</a> does. It's an open-source specification framework that turns natural language into structured, validated, implementation-ready specifications, designed specifically for AI-assisted development. We adopted it at DataChef for one of our projects, and the results surprised us.</p>
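<p>To make the Tasks output concrete, here is a rough sketch, in TypeScript, of the information such a task list carries. spec-kit's real artifacts are markdown documents, and every name and file path below is our own illustration rather than spec-kit's actual schema; the point is only the structure: IDs, file paths, dependencies, and parallel markers.</p>
<pre><code class="lang-typescript">// Conceptual model of a dependency-ordered task list.
// NOTE: spec-kit's real artifacts are markdown files; every name
// here is our own illustration, not spec-kit's actual schema.
interface SpecTask {
  id: string;          // e.g. "T001"
  userStory: string;   // the user story this task belongs to
  files: string[];     // exact file paths the task touches
  dependsOn: string[]; // task IDs that must finish first
  parallel: boolean;   // safe to run alongside other parallel tasks
}

const exampleTasks: SpecTask[] = [
  { id: "T001", userStory: "US1", files: ["shared/types.ts"], dependsOn: [], parallel: false },
  { id: "T002", userStory: "US1", files: ["src/hooks/useOrders.ts"], dependsOn: ["T001"], parallel: true },
  { id: "T003", userStory: "US1", files: ["src/pages/Orders.tsx"], dependsOn: ["T001"], parallel: true },
];

// Group tasks into phases: a task is ready once all its dependencies are done.
function phases(all: SpecTask[]): SpecTask[][] {
  const done = new Set&lt;string&gt;();
  const result: SpecTask[][] = [];
  let remaining = [...all];
  while (remaining.length &gt; 0) {
    const ready = remaining.filter((t) =&gt; t.dependsOn.every((d) =&gt; done.has(d)));
    if (ready.length === 0) throw new Error("Cycle in task dependencies");
    result.push(ready);
    for (const t of ready) done.add(t.id);
    remaining = remaining.filter((t) =&gt; !done.has(t.id));
  }
  return result;
}
</code></pre>
<p>Running <code>phases(exampleTasks)</code> yields two phases: the shared-types change first, then the two independent component updates together. That ordering is exactly the information the implement step consumes.</p>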
<h2>What this looks like in practice</h2>
<p>Here's a real example from our project. We needed to replace pagination with infinite scrolling across 5 different table views. Instead of diving into code, we started with:</p>
<blockquote>
<p>"Remove the paginations and load all data in every table, but keep the scroll inside the table, not for the whole page."</p>
</blockquote>
<p>That single sentence went through the spec-kit pipeline and produced:</p>
<ul>
<li><p>A specification with 2 user stories, 8 functional requirements, and 6 success criteria</p>
</li>
<li><p>A research document with 5 technical decisions (CSS strategies for sticky headers, flex layout patterns, data volume analysis)</p>
</li>
<li><p>API contracts documenting exactly how 4 endpoints would change</p>
</li>
<li><p>A task list of 14 tasks across 3 phases, with explicit parallel execution groups</p>
</li>
<li><p>A quickstart checklist with 30+ manual verification items</p>
</li>
</ul>
<p>The AI then executed all 14 tasks (removing pagination from shared types, simplifying React Query hooks, updating 5 page components with sticky headers and scroll containers) and produced zero TypeScript errors and zero test failures.</p>
<p>The whole feature, from English sentence to working code, followed a traceable path where every decision was documented.</p>
<h2>Why guardrails matter more than prompts</h2>
<p>Here's what we learned after shipping 8 features this way:</p>
<p><strong>Specifications are the real prompt engineering</strong>. A well-structured spec with clear acceptance criteria gives AI everything it needs to generate correct code. We stopped tweaking prompts and started improving specifications.</p>
<p><strong>Research prevents costly mistakes</strong>. For one feature, the research phase discovered that our data volumes (under 200 records per table) didn't justify the complexity of incremental loading. That decision (documented with rationale) saved us from over-engineering. Without the research step, the AI would have happily built an unnecessarily complex infinite scroll system.</p>
<p><strong>Parallel task execution is a superpower</strong>. Because spec-kit identifies which tasks touch different files, multiple AI agents can work simultaneously. In one phase, 5 agents updated 5 different page components in parallel. What would have been sequential 20-minute work completed in under 3 minutes.</p>
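<p>A minimal sketch of how that dispatch can work, reusing the hypothetical <code>SpecTask</code> shape from the earlier sketch; <code>runTask</code> is a stand-in for handing a task to a coding agent:</p>
<pre><code class="lang-typescript">// Phase-by-phase execution with parallel dispatch.
// `runTask` is a hypothetical stand-in for invoking a coding agent.
declare function runTask(t: SpecTask): Promise&lt;void&gt;;

async function executePhases(plan: SpecTask[][]): Promise&lt;void&gt; {
  for (const phase of plan) {
    const parallel = phase.filter((t) =&gt; t.parallel);
    const sequential = phase.filter((t) =&gt; !t.parallel);
    // Parallel-marked tasks touch disjoint files, so agents can run them together.
    await Promise.all(parallel.map(runTask));
    // Everything else runs one at a time, in dependency order.
    for (const t of sequential) await runTask(t);
  }
}
</code></pre>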
<p><strong>The spec is the documentation</strong>. Every feature has a specs/ directory with its specification, plan, research decisions, API contracts, and task history. Three months from now, when someone asks "why does the dashboard scroll differently from the tables?", the answer is in specs/008-table-infinite-scroll/research.md, decision R-002.</p>
<h2>The numbers</h2>
<p>Across our project, spec-driven development with AI produced:</p>
<table>
<thead>
<tr>
<th><strong>Metric</strong></th>
<th><strong>Count</strong></th>
</tr>
</thead>
<tbody><tr>
<td>Features shipped</td>
<td>8</td>
</tr>
<tr>
<td>Production releases</td>
<td>11</td>
</tr>
<tr>
<td>Completed tasks</td>
<td>190</td>
</tr>
<tr>
<td>Spec artifacts generated</td>
<td>60</td>
</tr>
<tr>
<td>Application source code</td>
<td>17,000 lines</td>
</tr>
<tr>
<td>Languages/frameworks</td>
<td>TypeScript, React 19, Express 5, TailwindCSS 4</td>
</tr>
</tbody></table>
<p>Every one of those 190 tasks was derived from a specification, planned with documented rationale, and executed with explicit dependencies. Not a single feature started with "just write the code."</p>
<h2>Trust, but verify</h2>
<p>We're not saying AI writes perfect code. It doesn't. During this project we caught edge cases, fixed styling inconsistencies, and debugged issues that the AI missed. But the spec-driven approach changed when we caught those problems.</p>
<p>Instead of discovering architectural mistakes after 500 lines of code, the specification and planning phases surface them before any code exists. Instead of untangling spaghetti from a freeform AI session, every change maps to a task that maps to a user story that maps to a requirement.</p>
<p>The AI is the engine. The specification is the steering wheel.</p>
<h2>What's next</h2>
<p>Spec-kit is open source and designed to work with any AI coding assistant: Claude Code, Cursor, Copilot, or whatever comes next. We'll keep using it and sharing what we learn along the way.</p>
<p>If you're building software with AI and finding that the output is unpredictable, hard to maintain, or doesn't match what you actually needed, the problem probably isn't the AI. It's the input.</p>
<p>Give the AI a specification, not a wish.</p>
<h2>Other tools in this space</h2>
<p>Spec-driven development is picking up momentum. If you're exploring the approach, here are the tools worth looking at:</p>
<p><a href="https://github.com/github/spec-kit">spec-kit</a>: The one we used. GitHub's open-source toolkit that provides templates, a CLI, and prompts to move work through specify → plan → tasks → implement. Agent-agnostic, works with Claude Code, Copilot, Gemini, and others.</p>
<p><a href="https://kiro.dev">Kiro</a>: An AI IDE by AWS (VS Code fork) with spec-driven development built in. You describe requirements in natural language, Kiro generates user stories, technical design docs, and implementation tasks. Great for teams that want the workflow embedded in the editor itself.</p>
<p><a href="https://tessl.io">Tessl</a>: An agent enablement platform with a CLI that doubles as an MCP server. Its registry indexes 1,000+ reusable skills and docs for 10,000+ packages, keeping agent context version-matched to your dependencies. Focuses on making coding agents more effective through structured, versioned context.</p>
<p><a href="https://github.com/Fission-AI/OpenSpec">OpenSpec</a>: A lightweight, fluid alternative that works with 20+ AI assistants. Uses an action-based workflow (proposal → specs → design → tasks → implement) with no rigid phase gates, you can update any artifact at any time.</p>
<p>The tooling is still young, but the pattern is clear: the teams that give AI structured input will ship faster and more reliably than those prompting from scratch.</p>
]]></content:encoded></item><item><title><![CDATA[Event Storming and Context Mapping for Data & AI Product Managers]]></title><description><![CDATA[Discovery and requirements gathering are among the most critical phases of data product management. Yet they're also among the most challenging. Too often, data teams fall into the trap of being order-takers, responding to ad-hoc requests rather than...]]></description><link>https://blog.datachef.co/event-storming-context-mapping-data-ai-product-managers</link><guid isPermaLink="true">https://blog.datachef.co/event-storming-context-mapping-data-ai-product-managers</guid><category><![CDATA[#Domain-Driven-Design]]></category><category><![CDATA[Data Products]]></category><category><![CDATA[Product Management]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Fri, 30 Jan 2026 14:32:11 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1769776647356/6fb3751a-d6dd-47bd-b2f0-040cfb0dd7e3.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Discovery and requirements gathering are among the most critical phases of data product management. Yet they're also among the most challenging. Too often, data teams fall into the trap of being order-takers, responding to ad-hoc requests rather than proactively shaping a strategic roadmap. To truly unlock the value of data as a product, you need to reverse this narrative.</p>
<p>But how do you discover data products when stakeholders themselves might struggle to articulate what they need? How do you gather requirements and design solutions when everyone has a different mental model of how things work?</p>
<p>This is where <strong>EventStorming</strong> and <strong>Context Mapping</strong>—two powerful tools from the Domain-Driven Design community—become invaluable for data product managers.</p>
<h2 id="heading-the-discovery-and-requirements-challenge">The Discovery and Requirements Challenge</h2>
<p>Many stakeholders are accustomed to treating data as a commodity, something they simply request when needed. They may not have the vocabulary or mental models to think about data products strategically. As a data product manager, you need tools that bridge this gap and enable collaborative discovery and requirements gathering.</p>
<p>These are the key ingredients for running effective sessions that actually uncover the needs of your consumers:</p>
<ul>
<li><p><strong>Collaboration</strong> across diverse teams and functions</p>
</li>
<li><p>Deep understanding of how the organization actually operates</p>
</li>
<li><p><strong>A shared language</strong> to discuss complex processes and systems</p>
</li>
<li><p>Frameworks that help stakeholders articulate their needs</p>
</li>
</ul>
<h2 id="heading-enter-eventstorming-and-context-mapping">Enter EventStorming and Context Mapping</h2>
<p>While EventStorming and Context Mapping are distinct techniques, they share powerful principles that make them effective for data product discovery and design:</p>
<p><strong>Bring the right people together.</strong> Not managers or directors, but the people who do the actual work: those who feel the pain of inefficient processes and possess deep operational knowledge. These are your domain experts, even if they don't carry that title.</p>
<p><strong>Create a collaborative space.</strong> Whether it's a physical whiteboard or a virtual canvas, you need a shared modeling space where everyone can contribute freely. The key is psychological safety: there are no wrong answers, and all perspectives are valued.</p>
<p><strong>Build shared understanding.</strong> By visualizing processes, systems, and interactions together, teams develop a common language and mental model that transcends organizational silos.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1769776056483/ae2761d7-5c6b-442e-813a-0e8740fcd6d7.png" alt="A cartoon depicts a collaborative work environment. In the first panel, diverse workers at a table think &quot;We do the work!&quot; while an overseer observes. In the second panel, individuals discuss ideas on a board under a sign reading &quot;No wrong answers,&quot; with a light bulb symbolizing creative thinking." class="image--center mx-auto" /></p>
<h2 id="heading-context-mapping-in-action-displacing-legacy-and-discovering-future-state">Context Mapping in Action: Displacing Legacy and Discovering Future State</h2>
<p>We recently worked with a global retail company headquartered in the Netherlands undergoing a massive transformation: implementing a new data model to support their product lifecycle management. Dozens of systems across their IT landscape consumed data from the old model, each with custom point-to-point integrations built at different times using different technologies.</p>
<p>The challenge was understanding a complex web of dependencies, many of which existed only in the minds of developers (some of whom had left the company). Knowledge gaps were everywhere.</p>
<p><strong>Context Mapping became our guide.</strong></p>
<p>We ran sessions with each team managing individual downstream systems, bringing together:</p>
<ul>
<li><p>End users who understood how the tools were actually used</p>
</li>
<li><p>Developers who had worked on the integrations</p>
</li>
<li><p>Anyone with contextual knowledge about edge cases and workarounds</p>
</li>
</ul>
<p>Together, we mapped the current landscape:</p>
<ul>
<li><p>How each integration was built and how it works today</p>
</li>
<li><p>Relationships between systems: upstream versus downstream, who conforms to whose standards, who introduces transformations, and where they happen</p>
</li>
<li><p>Boundaries between different contexts: what happens when something goes wrong? Who notices the problem, and who's responsible for fixing it?</p>
</li>
</ul>
<p>But Context Mapping isn't just about documenting the present. It helped us envision the future state: how would the introduction of the new product data model reshape this landscape? Who would be responsible for implementing each piece of the solution?</p>
<p>The sessions created something invaluable: <strong>shared ownership.</strong> Everyone who would play a role in the migration gained buy-in on the solution. We emerged with a clear understanding of implications, impacts, and responsibilities for successful adoption.</p>
<h2 id="heading-eventstorming-in-action-bringing-the-bottleneck-into-the-picture">EventStorming in Action: Bringing the Bottleneck into the Picture</h2>
<p>Within the same program, we hit a roadblock. The design of the new data model struggled to handle a specific edge case in the product lifecycle—an exception to the happy path that threatened to derail implementation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1769776114222/fac9efc6-bc1c-49a2-8ab1-48ba12a22efd.png" alt="A cartoon shows four figures having an EventStorming session in front of a board. They appear confused, with question marks in speech bubbles. Another figure stands apart, looking worried, holding a puzzle piece labeled &quot;Critical Gap.&quot;" class="image--center mx-auto" /></p>
<p>This is precisely where <strong>EventStorming</strong> shines.</p>
<p>Same principle: bring in people who do the operational work day-to-day. But rather than focusing on systems and bounded contexts, start with a seed event to kick-start the conversation. No constraints on participants in terms of time and space, just a huge whiteboard on which to collaboratively build a timeline of events.</p>
<p>When disagreements emerge about what happens between events, or the correct sequencing, that's your signal as facilitator to dig deeper:</p>
<ul>
<li><p>What are the <strong>policies</strong> that govern these transitions?</p>
</li>
<li><p>Which <strong>actors and</strong> <strong>systems</strong> are involved?</p>
</li>
<li><p>What can happen in parallel and what cannot?</p>
</li>
</ul>
<p>Once we had business experts and engineers in the same room, it was easy to construct an accurate timeline of events that allowed everyone to acknowledge the critical gap in the design of the data model. EventStorming helped us see the big picture first and then model the chaos, with the support of a standardized grammar that ensured everybody was using the same language.</p>
<p>The process also served another crucial purpose: the engineers who would implement the solution gained both technical and business knowledge about the purpose and processes behind what they were building. This deep understanding would prove invaluable during development.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1769776255294/6e27fb60-d117-48f6-8e85-ca0c5d21c925.png" alt="The same figures are having an EventStorming session and the &quot;Critical Gap&quot; piece of the puzzle is now on the board as everyone looks more relaxed." class="image--center mx-auto" /></p>
<h2 id="heading-why-these-tools-matter-for-data-product-managers">Why These Tools Matter for Data Product Managers</h2>
<p>As data product managers, our success hinges on our ability to:</p>
<ul>
<li><p><strong>Build empathy</strong> with users and understand their Job-To-Be-Done</p>
</li>
<li><p><strong>Identify bottlenecks</strong> and pain points in existing processes</p>
</li>
<li><p><strong>Unite the team</strong> around the "why" behind what we're building</p>
</li>
<li><p><strong>Deliver solutions</strong> that create real business value</p>
</li>
</ul>
<p>EventStorming and Context Mapping are strategic tools that give you structured, collaborative frameworks to achieve all of this. They transform discovery from a vague, frustrating exercise into a concrete, energizing process.</p>
<p>Team members will enter the room as strangers (or in the worst cases, adversaries!) and leave with a shared understanding of what needs to be done next.</p>
<h2 id="heading-getting-started-your-path-to-collaborative-discovery">Getting Started: Your Path to Collaborative Discovery</h2>
<p>Ready to transform your discovery process? Here's how to begin:</p>
<ol>
<li><p><strong>Start small.</strong> Pick one complex process or migration that would benefit from collaborative mapping. Use it as a pilot to demonstrate value.</p>
</li>
<li><p><strong>Invest in facilitation skills.</strong> These workshops require skilled facilitation to create psychological safety and guide productive conversations. Consider bringing in experienced facilitators for your first sessions.</p>
</li>
<li><p><strong>Focus on the right participants.</strong> Remember: doers over observers.</p>
</li>
<li><p><strong>Embrace messiness.</strong> The most valuable insights often emerge from disagreements and confusion—these are signs you're uncovering hidden complexity that needed to be surfaced.</p>
</li>
<li><p><strong>Document, but don't over-formalize.</strong> The real value is in the shared understanding built during the session, not just the artifacts produced.</p>
</li>
</ol>
<p>Discovery doesn't have to be a shot in the dark. With the right collaborative tools and facilitation, you can turn it into your competitive advantage.</p>
<p><strong>Want to go deeper?</strong> We're launching <strong>DataChef Academy</strong>, where you'll learn how to run effective EventStorming and Context Mapping workshops for data products. <a target="_blank" href="https://www.datachef.academy/">Join our waitlist</a> to be the first to know when these hands-on workshops become available.</p>
]]></content:encoded></item><item><title><![CDATA[Product Managers and the Art of Engineering a 'Yes']]></title><description><![CDATA[If you spend any time reading about product management online, you'll frequently encounter a piece of advice: saying no is the product manager's superpower. There are countless guides on how to say no, why to say no, and frameworks for saying no with...]]></description><link>https://blog.datachef.co/product-management-engineering-yes</link><guid isPermaLink="true">https://blog.datachef.co/product-management-engineering-yes</guid><category><![CDATA[Product Management]]></category><category><![CDATA[product]]></category><category><![CDATA[Roadmap]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Mon, 19 Jan 2026 15:02:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768834480384/702b1c86-fb6a-4b50-ba1f-0d3bf0141056.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you spend any time reading about product management online, you'll frequently encounter a piece of advice: <strong>saying no is the product manager's superpower</strong>. There are countless guides on how to say no, why to say no, and frameworks for saying no with empathy. The narrative has become so dominant that many aspiring PMs believe their primary job is to be the gatekeeper who protects the team from distractions.</p>
<p>This isn't entirely wrong: saying no <em>is</em> part of the job, of course. But framing it as the defining skill of product management is dangerous. Worse, it creates a culture where product managers optimize for the wrong outcomes: process compliance, roadmap rigidity, and risk aversion.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768824499767/263a39d3-6cb1-425f-936d-831910cbd858.png" alt="Illustration of a &quot;PM&quot; sitting at a desk with a roadmap, saying &quot;No&quot; to a line of people with feature requests. Another group of people (the team and management) appear happy, thinking &quot;Thanks for the focus!&quot;" class="image--center mx-auto" /></p>
<h2 id="heading-why-saying-no-feels-like-a-superpower">Why saying "no" feels like a superpower</h2>
<p>Let's be honest: saying no is actually easy. You point at a roadmap. You cite committed priorities. You deflect by adding stuff to the backlog. Done. It's a defensive move that requires little courage and no creativity. Why do I think it’s easy? For at least a couple of reasons.</p>
<p><strong>You’re protected by the process.</strong> Especially in organizations that operate with rigid methodologies and strong commitments to roadmaps and planning cycles. When you have a well-defined plan and established priorities, declining new requests becomes a routine exercise that is actually encouraged by organizational principles.</p>
<p><strong>It provides immediate social rewards.</strong> Your engineering team appreciates that you're shielding them from constant context switching and scope creep. Your manager values that you're staying disciplined, following the process, and delivering what you had committed to deliver. There's a pleasant sense of being the responsible adult in the room, protecting everyone from chaos.</p>
<p>But here's the uncomfortable truth: <strong>there’s nothing special about saying “no”</strong>. In most product management careers, there will be far more instances of saying no than yes. The opportunities you <em>don't</em> pursue will always outnumber the ones you do.</p>
<h2 id="heading-the-pm-who-always-says-no-is-perceived-as-an-obstacle">The PM who always says “no” is perceived as an obstacle</h2>
<p>Worse, encouraging PMs to default to "no" enables a dangerous pattern: product managers who reject everything outside the plan train stakeholders to work around them.</p>
<p>When business stakeholders or customers present something urgent, they're not trying to derail your carefully crafted plans. They're bringing you a problem they believe is important. In many cases, they're right. They're closest to the customer pain, the market dynamics, the competitive threats. They have information you might not have.</p>
<p>Great product managers recognize this and treat it as valuable input, not an annoyance to be deflected. They ask questions: Why is this urgent now? What happens if we don't address this? Who is affected and how severely? What outcome are you trying to achieve? After asking questions, they ponder and offer options.</p>
<p>When product managers consistently demonstrate the <strong>willingness to find a way forward</strong>, they build tremendous stakeholder confidence. Stakeholders learn that this person is thinking first and foremost about solving the right problems, not blinded by process for the sake of process. They will see them as a partner in achieving business outcomes.</p>
<p>Conversely, product managers who shut down conversations about unplanned work will be seen as obstacles. Stakeholders lose patience after hearing for the third time that their customer pain point is "in the backlog." They escalate to executives, who then mandate the work anyway. In the worst cases, stakeholders might even secure budget to build shadow solutions outside your product. In the process, you lose the chance to own the problem space. Worse, you lose credibility and authority, neither of which is easily regained.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768824511250/4bf5c107-0b29-4968-8693-96f781c3d295.png" alt="Comic illustration with two stick figure characters. One says they need a &quot;shadow&quot; solution without informing the PM. The other hands over money and says, &quot;Here's the budget. Bypass them.&quot;" class="image--center mx-auto" /></p>
<h2 id="heading-what-actually-defines-great-product-leaders">What actually defines great product leaders</h2>
<p>Great product management is about <strong>understanding the "why" before the "what" and "how."</strong> A product manager's core responsibility is to own the “why”, i.e., to maintain focus on business outcomes and end-user needs.</p>
<p>This means that the real differentiator of their success is not <em>how often</em> they say no, but <strong>which opportunities they say yes to</strong> and the impact they will generate on the business.</p>
<p>The challenge—and the art—lies in spotting the right opportunities that make you go: <em>"Yes, we're going to change our roadmap. We've found something that is actually more important and more impactful to do."</em></p>
<p>This shouldn't happen every day. If you're constantly changing direction, you create chaos and undermine trust. But when the right opportunity presents itself, great product leaders (at any level: ICs, directors, C-level) have the instinct to recognize it and the courage to act on it.</p>
<h2 id="heading-the-anatomy-of-engineering-a-yes">The anatomy of engineering a "yes"</h2>
<p>So, what does it mean to <em>engineer</em> a “yes”? At DataChef, we believe it involves three things that are part of the core skill set of a good product manager.</p>
<p><strong>Rapid validation.</strong> When a potentially important opportunity surfaces, skilled product managers quickly validate whether it's worth pursuing. They have a strong instinct that helps them uncover the underlying problem, even when the pain point is poorly articulated or buried in raw feedback. They spot patterns and connect the dots, tying new requests to problems that surfaced earlier or recognizing similarities with already-prioritized work.</p>
<p><strong>Creative solution design.</strong> Here's where the <em>engineering</em> comes in. Saying yes doesn't mean accepting the proposed solution at face value or dropping everything else. It means coming up with options to address the high-impact problem while managing constraints. For example, descoping less critical work, negotiating scope to deliver 80% of the value with 20% of the effort, or identifying overlaps with existing priorities that address the same need.</p>
<p><strong>Stakeholder alignment without creating uncertainty.</strong> The hardest part of engineering a “yes” is doing so without creating chaos in the team or losing stakeholder confidence. This requires clear communication about why this opportunity matters more than what was planned and transparency about what's being deprioritized.</p>
<h2 id="heading-its-on-leaders-to-build-a-culture-of-options">It’s on leaders to build a culture of options</h2>
<p>If you’re a Director of Product, or a Chief Product Officer, here’s my advice to you. Companies get the product management they incentivize. If you reward product managers for sticking to the plan, minimizing stakeholder requests, and keeping teams busy with predetermined work, you'll get gatekeepers and coordinators. In many organizations, especially in Europe, that is the status quo.</p>
<p>If instead you reward product managers for delivering measurable business outcomes, solving high-impact customer problems, and building strong stakeholder relationships, you'll get product managers who think creatively and take calculated risks while driving real value.</p>
<p>The real trap is confusing process compliance with product excellence. Leaders should be careful about the metrics they pick to measure success and the stories they promote as virtuous examples.</p>
<p>And when performance review time comes, remember: it takes courage to engineer a “yes”. The product managers who get invited to strategic conversations, the ones who build loyal followings among stakeholders and teams, are the ones who figure out how to say yes to the right things.</p>
<p>So the next time you hear someone say "the product manager's superpower is saying no," push back. <strong>Because at the end of the day, nobody remembers the product manager who was really good at saying no. They remember the product manager who found a way to solve the most important problems, even when it wasn't easy.</strong></p>
]]></content:encoded></item><item><title><![CDATA[Copilot VS. Custom LLM: Navigating the Generative BI Landscape]]></title><description><![CDATA[For years, Business Intelligence has followed a familiar, often frustrating pattern. An executive has a question, an analyst hunts for the answer, engineers build data pipelines, and eventually—days or weeks later—a dashboard appears. By the time the...]]></description><link>https://blog.datachef.co/copilot-vs-custom-llm-navigating-the-generative-bi-landscape</link><guid isPermaLink="true">https://blog.datachef.co/copilot-vs-custom-llm-navigating-the-generative-bi-landscape</guid><category><![CDATA[Generative BI]]></category><category><![CDATA[conversational bi]]></category><category><![CDATA[PowerBI]]></category><category><![CDATA[copilot]]></category><dc:creator><![CDATA[Ali Mohammadzadeh]]></dc:creator><pubDate>Tue, 16 Dec 2025 09:01:17 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765549252488/4f94da5b-bf49-4524-af3d-57e900504de3.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For years, Business Intelligence has followed a familiar, often frustrating pattern. An executive has a question, an analyst hunts for the answer, engineers build data pipelines, and eventually—days or weeks later—a dashboard appears. By the time the insights arrive, the decision window has often closed.</p>
<p>We are now entering a new stage of analytics. We are shifting away from static dashboards and moving toward <strong>conversational intelligence</strong>: unlocking an organization's institutional memory through natural language.</p>
<p>The question is no longer <em>if</em> you should adopt <a target="_blank" href="https://hashnode.com/post/cmira5zvy000502l764k49fv4">Generative BI</a>, but <em>how</em>. Do you lean into the Microsoft ecosystem with Copilot, or does your context justify a custom architecture running on your own terms?</p>
<h3 id="heading-copilot-vs-custom-quick-comparison">Copilot vs Custom: Quick Comparison</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Dimension</strong></td><td><strong>Copilot in Power BI</strong></td><td><strong>Custom Open-Source Stack</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Time to value</td><td>Fast if you already run on Fabric/Premium</td><td>Medium; depends on infra readiness and team</td></tr>
<tr>
<td>Governance</td><td>Native Entra ID, RLS/OLS, Purview lineage</td><td>Can align with existing infra, IAM, and compliance needs; governance, logging, and lineage must be designed</td></tr>
<tr>
<td>Cost model</td><td>Fixed capacity + licenses</td><td>Variable; infra + pay-per-inference</td></tr>
<tr>
<td>Model choice</td><td>Microsoft-supported models</td><td>Bring-your-own; swap models freely</td></tr>
<tr>
<td>Data access</td><td>Optimized for Power BI semantic models</td><td>Direct-to-warehouse and docs via RAG/agents</td></tr>
<tr>
<td>Residency &amp; privacy</td><td>Cloud with tenant and regional controls</td><td>Private VPC or on‑prem, air‑gapped possible</td></tr>
<tr>
<td>Flexibility</td><td>Tight Microsoft ecosystem integration</td><td>High; full control of components and routing</td></tr>
<tr>
<td>Operations</td><td>Managed; lower operational cost</td><td>Self-managed; higher operational cost because you run everything yourself</td></tr>
</tbody>
</table>
</div><p><strong>Note:</strong> A custom open-source stack typically combines components like LangChain/LlamaIndex for orchestration, open-source models (Llama/Mistral), a vector store (Elasticsearch/pgvector/Pinecone), and your existing warehouse/lake, all deployed within your current infrastructure (e.g., Kubernetes, VPC, existing IAM).</p>
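<p>As a rough sketch of the query path such a stack implements, here is the skeleton in TypeScript. Every declared function below is a hypothetical placeholder for a component you would wire in yourself (embedding model, vector store, LLM endpoint, warehouse client); none of them are real library APIs.</p>
<pre><code class="lang-typescript">// Skeleton of the question-to-answer path in a custom Generative BI stack.
// All four declared functions are hypothetical placeholders for your own
// components (embedding model, vector store, LLM endpoint, warehouse client).
declare function embed(text: string): Promise&lt;number[]&gt;;
declare function vectorSearch(v: number[], opts: { topK: number }): Promise&lt;string[]&gt;;
declare function llm(prompt: string): Promise&lt;string&gt;;
declare function runWarehouseQuery(sql: string): Promise&lt;unknown[]&gt;;

async function answer(question: string): Promise&lt;string&gt; {
  // 1. Retrieve grounding context: schema docs, metric definitions, examples.
  const context = await vectorSearch(await embed(question), { topK: 5 });

  // 2. Have the model write SQL grounded in that context.
  const sql = await llm(
    "Given these table and metric definitions:\n" +
      context.join("\n") +
      "\nWrite a SQL query answering: " +
      question
  );

  // 3. Execute against the warehouse and summarize for the user.
  const rows = await runWarehouseQuery(sql);
  return llm("Summarize for a business audience:\n" + JSON.stringify(rows));
}
</code></pre>
<p>In a production design you would add guardrails around step 2 (query validation, allow-listed schemas) and log every step, which is where the governance patterns later in this article come in.</p>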
<h2 id="heading-the-promise-power-bi-with-copilot"><strong>The Promise: Power BI with Copilot</strong></h2>
<p>Microsoft has integrated Copilot into Power BI, and for many organizations this is the most direct path into Generative BI. The vision is to remove technical barriers, allowing decision-makers to access data as easily as sending a message.</p>
<p>Imagine a sales director who needs to prepare for a quarterly review. Instead of filtering through slicers on a complex report, they type:</p>
<blockquote>
<p>“Show me sales performance for Q3 compared with Q2. Break it down by region and point out the three weakest products.”</p>
</blockquote>
<p>Copilot prepares a visual and a short explanation of why performance changed.</p>
<p>They follow up with:</p>
<blockquote>
<p>"Why did our gross margin drop in October?"</p>
</blockquote>
<p>Copilot scans the dataset and explains:</p>
<blockquote>
<p><em>Gross margin fell by four percent because supply chain costs for Product Line X rose by fifteen percent.</em></p>
</blockquote>
<p>This is the promise: simple questions, instant answers, automatic visuals. No need to open a ticket or wait for the next sprint.</p>
<h2 id="heading-where-copilot-shines"><strong>Where Copilot Shines</strong></h2>
<p>It is important to be clear where Microsoft has built a real advantage. If your data strategy is centered on Azure, Copilot offers benefits that are hard to match with a purely custom stack:</p>
<ul>
<li><p><strong>Seamless integration:</strong> It lives where your users already work: inside Power BI, Teams, and Excel. That matters for adoption.</p>
</li>
<li><p><strong>Identity &amp; Governance:</strong> It leverages <strong>Entra ID</strong> (formerly Azure AD) for robust Role-Based Access Control (RBAC). If a user is restricted by row-level security, Copilot will not see that data either. You get consistent access rules from reports down into conversational queries.</p>
</li>
<li><p><strong>Auditability:</strong> With <strong>Microsoft Purview</strong> integration, you get built-in lineage, cataloging, and sensitivity labeling. This goes beyond just security. Analysts can trace where an answer came from, compliance teams can reason about risk, and executives can trust that numbers are not appearing from nowhere.</p>
</li>
</ul>
<h2 id="heading-the-decision-matrix-choosing-your-path"><strong>The Decision Matrix: Choosing Your Path</strong></h2>
<p>While Copilot is powerful, it is not a universal answer. Implementing Generative BI via the standard Microsoft route comes with specific infrastructure requirements and constraints that may not fit every agile or cost-conscious business.</p>
<p>Here is a framework to help you decide which approach fits your organization.</p>
<h3 id="heading-when-copilot-power-bi-is-the-right-answer">When Copilot + Power BI is the Right Answer</h3>
<p>Staying with the official Microsoft platform is often the best choice if:</p>
<ol>
<li><p><strong>You are a "Microsoft Shop":</strong> Your organization already uses Entra ID, Purview, and Power BI Premium or Fabric capacity. Your BI teams are comfortable with Power BI as the main semantic layer.</p>
</li>
<li><p><strong>You have solid semantic models:</strong> Your semantic models are well maintained. Copilot relies heavily on the quality of the underlying semantic layer; it struggles with messy or unstructured data.</p>
</li>
<li><p><strong>Governance and auditability are top priorities:</strong> You want audit logs, sensitivity labels, and lineage in a form that security and compliance already understand. You would rather inherit this from Entra + Purview than rebuild it.</p>
</li>
<li><p><strong>You accept the Fabric/Premium commitment:</strong> You are willing to pay the "Fabric Tax" (committing to Fabric or Premium capacities) for the ease of a fully managed service.</p>
</li>
</ol>
<h3 id="heading-when-a-custom-open-source-solution-makes-sense">When a Custom Open-Source Solution Makes Sense</h3>
<p>A custom Generative BI layer, built with open-source components and your own warehouse, becomes compelling when:</p>
<ol>
<li><p><strong>You have strict data-residency or “offline” needs:</strong> In industries like defense, healthcare, or finance, sending data to public cloud-hosted LLMs is not acceptable. You may need to run open-source models like Llama 3 or Mistral on-premise or in a private VPC where data never leaves your control.</p>
</li>
<li><p><strong>You care deeply about cost control and routing:</strong> Capacity-based billing for Fabric means you pay whether Copilot runs one query or one million. With a custom solution, you have control over model routing, allowing you to leverage low-cost models for most use cases and reserve more expensive reasoning models for complex tasks.</p>
</li>
<li><p><strong>Your “truth” lives in the warehouse, not only in Power BI:</strong> You want your conversational interface to talk directly to gold tables in Databricks, Snowflake, or other warehouses, and to combine that with unstructured documents via RAG and agents. Copilot is optimized for Power BI semantic models; a custom agent can be designed around your broader data estate.</p>
</li>
<li><p><strong>You want vendor flexibility:</strong> You want the freedom to swap the "brain" of your operation without changing your entire platform. If a new, faster model is released tomorrow, you want to plug it in immediately without waiting for a vendor update.</p>
</li>
</ol>
<h2 id="heading-the-friction-why-copilot-isnt-for-everyone">The Friction: Why Copilot Isn’t for Everyone</h2>
<p>While Copilot’s capabilities are undeniably impressive, adopting it via the standard route often comes with substantial prerequisites and constraints — a reality that can be a barrier for agile or cost-sensitive organizations.</p>
<ul>
<li><p><strong>“Capacity” costs required</strong>: To enable Copilot in Power BI (or the larger Microsoft Fabric ecosystem), your workspace must be hosted on a <strong>paid capacity</strong> — either Fabric capacity (F-SKU) or Power BI Premium capacity (P1 or higher). It’s not sufficient to only have a free or trial license. <a target="_blank" href="https://learn.microsoft.com/en-us/power-bi/create-reports/copilot-enable-power-bi">Microsoft Learn</a></p>
</li>
<li><p><strong>Fixed overhead, even if usage is low</strong>: Because capacity-based billing applies regardless of actual usage, you may incur infrastructure costs even if few Copilot queries are made. This can be a heavy commitment — especially for organizations with sporadic or unpredictable usage.</p>
</li>
<li><p><strong>Ecosystem lock-in</strong>: Copilot is deeply tied to Microsoft infrastructure, APIs, and model choices. That is a benefit if you are all‑in on the stack. It is a constraint if you want to experiment with open-source models, alternative warehouses, or a multi‑cloud architecture.</p>
</li>
<li><p><strong>Compliance &amp; data-residency restrictions</strong>: For industries with strict regulatory requirements (finance, healthcare, etc.), using Copilot via cloud-based LLMs can raise concerns about data residency, governance, and compliance. While there are regional and tenant-level settings in Fabric, they require careful configuration. <a target="_blank" href="https://learn.microsoft.com/en-us/fabric/fundamentals/how-copilot-works">Microsoft Learn</a></p>
</li>
</ul>
<h2 id="heading-the-datachef-approach-co-designing-your-architecture"><strong>The DataChef Approach: Co-Designing Your Architecture</strong></h2>
<p>At DataChef, we do not sell a boxed “GenBI product” that competes with Microsoft. We help you <strong>design and implement the architecture that fits your constraints</strong>, which often includes Copilot.</p>
<p>For some clients, that means tightening their Power BI models, access rules, and Purview setup so Copilot can actually be trusted. For others, it means designing a custom “institutional memory” engine that runs in their own environment, side‑by‑side with existing BI.</p>
<p>Here are a few patterns we tend to design for:</p>
<h3 id="heading-1-specialized-security-amp-air-gapped-intelligence">1. Specialized Security &amp; "Air-Gapped" Intelligence</h3>
<p>When zero data leakage is non-negotiable:</p>
<ul>
<li><p>Deploy open-source models fully within your private cloud or on‑prem environment.</p>
</li>
<li><p>Fine‑tune models on your own jargon and metrics definitions so answers reflect how <em>your</em> organization speaks and reasons.</p>
</li>
<li><p>Keep all prompts, logs, and embeddings inside your own perimeter.</p>
</li>
</ul>
<h3 id="heading-2-flexible-intelligence-through-model-routing">2. Flexible Intelligence Through Model Routing</h3>
<p>When cost and performance both matter (see the sketch after this list):</p>
<ul>
<li><p>Route simple, high‑volume queries to small, local models.</p>
</li>
<li><p>Reserve top‑tier hosted models (e.g. Claude, GPT‑4 class) for complex reasoning, planning, or edge cases.</p>
</li>
<li><p>Evaluate and swap models as the landscape evolves, without changing your front‑end experience.</p>
</li>
</ul>
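<p>A minimal sketch of such a router, assuming two already-deployed endpoints (both placeholders here) and a deliberately naive complexity heuristic:</p>
<pre><code class="lang-typescript">// Cost-aware model routing. `localModel` and `hostedModel` are
// hypothetical placeholders for your deployed endpoints.
declare function localModel(prompt: string): Promise&lt;string&gt;;
declare function hostedModel(prompt: string): Promise&lt;string&gt;;

// Deliberately naive heuristic; in practice this could be a small
// classifier model or rules keyed to query type.
function needsHeavyReasoning(prompt: string): boolean {
  return prompt.length &gt; 500 || /forecast|why|explain|compare/i.test(prompt);
}

async function route(prompt: string): Promise&lt;string&gt; {
  // Simple, high-volume lookups go to the cheap local model;
  // complex reasoning is reserved for the expensive hosted one.
  return needsHeavyReasoning(prompt) ? hostedModel(prompt) : localModel(prompt);
}
</code></pre>
<p>Because the routing decision lives in your own code, swapping either endpoint for a newer model changes one declaration, not the front-end experience.</p>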
<h3 id="heading-3-governance-by-design">3. Governance by Design</h3>
<p>When governance must span both Copilot and custom stacks:</p>
<ul>
<li><p>Integrate agents with your existing identity and access management (for example Entra or other IAM), so the same rules apply everywhere.</p>
</li>
<li><p>Log questions, answers, and underlying queries in your environment (a minimal sketch follows this list) so you can:</p>
<ul>
<li><p>See what people are actually asking.</p>
</li>
<li><p>Identify gaps in your semantic models or warehouse.</p>
</li>
<li><p>Improve definitions and data products over time.</p>
</li>
</ul>
</li>
</ul>
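<p>A minimal sketch of what such a log record can capture; the field names are ours, to be aligned with your own governance requirements:</p>
<pre><code class="lang-typescript">// Minimal audit record for conversational BI. Field names are
// illustrative; align them with your own governance requirements.
interface GenBiAuditRecord {
  timestamp: string;      // ISO-8601
  userId: string;         // from your IAM (e.g. an Entra object ID)
  question: string;       // what the user asked
  generatedQuery: string; // the SQL/DAX the agent produced
  answer: string;         // what was shown to the user
  model: string;          // which model answered (useful for routing audits)
  datasets: string[];     // tables or semantic models touched
}

// Append-only for simplicity; in production this could be a warehouse
// table feeding the gap analysis described in the list above.
const auditLog: GenBiAuditRecord[] = [];

function recordInteraction(entry: GenBiAuditRecord): void {
  auditLog.push(entry);
}
</code></pre>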
<h2 id="heading-conclusion-partnering-for-the-future"><strong>Conclusion: Partnering for the Future</strong></h2>
<p>The shift to conversational BI is underway. The exact tool you use (Copilot, a custom agent, or both) matters less than whether you have:</p>
<ul>
<li><p>A clear view of your current BI and data platform.</p>
</li>
<li><p>An honest assessment of your governance and residency constraints.</p>
</li>
<li><p>A cost model that fits how you actually plan to use Generative BI.</p>
</li>
</ul>
<p>For many Microsoft‑centric organizations, the right move is to <strong>start with Copilot in Power BI</strong>, provided the semantic models and governance are in good shape. As soon as you hit residency, multi‑cloud, or advanced routing needs, it becomes worth designing a <strong>custom Generative BI lane</strong> alongside it.</p>
<p><strong>How we help.</strong> If you are wrestling with this choice, a good first step is to take one or two concrete use cases - rather than your entire BI estate - and ask:</p>
<ul>
<li><p>Can Copilot, on top of our current models and governance, serve this well?</p>
</li>
<li><p>Where do residency, cost, or multi‑platform needs push us beyond what Copilot can reasonably cover?</p>
</li>
</ul>
<p>This is the kind of work we do with clients: we map your current BI setup, security requirements, and cost constraints, then co‑design the smallest viable next step - whether that is getting more value out of Copilot, adding a focused custom GenBI slice, or combining both into a coherent architecture.</p>
]]></content:encoded></item><item><title><![CDATA[Is Your Organization Ready for Data Mesh? A Practical Readiness Check]]></title><description><![CDATA[You are being asked to make a long‑term bet on data architecture.
Most conversations frame this as a choice between data lake and data mesh. Vendors, internal teams, and reference architectures encourage you to pick a side.

💡
“Data lake vs data mes...]]></description><link>https://blog.datachef.co/is-your-organization-ready-for-data-mesh-a-practical-readiness-check</link><guid isPermaLink="true">https://blog.datachef.co/is-your-organization-ready-for-data-mesh-a-practical-readiness-check</guid><category><![CDATA[Data Mesh]]></category><category><![CDATA[Organization Design]]></category><category><![CDATA[Data Products]]></category><category><![CDATA[team topologies]]></category><dc:creator><![CDATA[Bram Elfrink]]></dc:creator><pubDate>Thu, 11 Dec 2025 13:51:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/7hA2wqBcSF8/upload/45a11c230528bac1361708e3e96b1b64.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You are being asked to make a long‑term bet on data architecture.</p>
<p>Most conversations frame this as a choice between <strong>data lake</strong> and <strong>data mesh</strong>. Vendors, internal teams, and reference architectures encourage you to pick a side.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">“Data lake vs data mesh” often mixes up technology with operating model. You can run a lake or a warehouse in a centralized or decentralized way. In this article, we use “lake” as shorthand for a <strong>more centralized</strong> model, and “mesh” as shorthand for a <strong>more decentralized</strong> model, because that is how most organizations experience them in practice.</div>
</div>

<p>The more useful question is:</p>
<blockquote>
<p><strong><em>Does your organization have the structure, skills, and governance to operate the architecture you are buying into for the next 3-5 years?</em></strong></p>
</blockquote>
<p>If the answer is no, the right pattern on paper will not help in practice. You are making an operating‑model choice, not just a technology choice. If the model does not fit how your organization works, you lock in extra cost and rework, slower decisions, and a credibility problem for the data team.</p>
<p>If the match is roughly right, the effect is the opposite. You shorten time‑to‑change for important decisions, reduce shadow solutions, and build an architecture that supports the ambitions of the organization instead of fighting them.</p>
<p>This article is about that match. We will not argue for lake or mesh as abstract patterns. Instead, we will help you:</p>
<ul>
<li><p>Judge whether your organization is genuinely ready for a more decentralized model.</p>
</li>
<li><p>See what you can already apply from data mesh thinking on top of the central lake or warehouse you run today.</p>
</li>
</ul>
<p>By the end, you should be able to answer a practical question: <em>what level of decentralization can we operate safely now, and what can we grow into without breaking the business?</em></p>
<h2 id="heading-what-a-centralized-platform-is-and-when-it-fits">What a Centralized Platform Is (and When It Fits)</h2>
<p>In most organizations, the “data lake vs data mesh” decision is really a question of how centralized your platform is. A centralized platform — typically your data lake or warehouse — means one team owns the shared environment. Data lands and is modeled in one place, under one set of standards. Most meaningful changes flow through a single backlog and roadmap.</p>
<p>This is often the right starting point. You are moving from chaos to order. Data ownership in domains is still fuzzy. Your main pain is fragmentation and duplication, not yet a central bottleneck. In that context, a central platform behaves like a good monolith: one place to reason about data, one set of contracts, one team accountable when something breaks.</p>
<p>The trade‑off is clear. You concentrate capability, governance, and decision‑making in one place, and accept that many new use cases will queue behind the same team. At limited scale, this is usually a good deal. It lets you enforce standards, simplify tooling, and ship visible wins, especially if you are coming from scattered spreadsheets and side systems.</p>
<p>As volume, teams, and use cases grow, the same design can quietly become the throttle for change rather than the enabler. The central team’s backlog starts to define how quickly the business can move.</p>
<p>The question is not “is centralization bad?”, but “given how we actually work today, are we still at the scale where a single team can safely sit in the middle of everything?”</p>
<h2 id="heading-what-a-decentralized-platform-is-and-when-it-fits">What a Decentralized Platform Is (and When It Fits)</h2>
<p>At the other end of the spectrum is a decentralized platform — data mesh in practice. Instead of one central team owning most pipelines and models, stable domain teams own their <a target="_blank" href="https://blog.datachef.co/data-set-data-product-management">data products</a> end to end. They design the models, run the pipelines, and are accountable for quality and contracts. A central group still exists, but focuses on platform, guardrails, and enablement.</p>
<p>This model makes sense once your main problem is no longer “we have nothing consistent,” but “the central team is slowing everyone down.” Domains already move fast in their own products and services. They want similar control over the data that represents their part of the business, and they have enough engineering and analytics capability to carry that responsibility.</p>
<p>In that situation, a single central backlog has become the limit for how quickly the business can change. A decentralized model changes that constraint: you keep a strong shared platform, but move more decisions, design, and implementation into the domains that know the business best.</p>
<p>Moving to a decentralized model is like moving from a monolith to well‑designed services. You keep a shared foundation, but push ownership and change closer to where domain knowledge lives. When it fits, this gives you faster decisions, tighter alignment between data and reality, and fewer shadow solutions, because teams no longer need to route around a central bottleneck to get work done.</p>
<p>Here too, the question is not “are we doing data mesh?”, but “do we have the structure, skills, and trust to let domains own real data products on top of a shared platform, without collapsing into chaos?”</p>
<h2 id="heading-antipatterns-when-each-model-fails-in-practice">Antipatterns: When Each Model Fails in Practice</h2>
<p>Centralized and decentralized platforms both work when they match how the organization actually operates. Even when the technology is sound, the wrong match with your organization creates failure patterns that look very similar across companies.</p>
<h3 id="heading-forcing-data-mesh-into-an-unready-organization">Forcing Data Mesh into an Unready Organization</h3>
<p>You are likely forcing mesh too early if you recognize several of these signals:</p>
<ul>
<li><p><strong>Org shape vs. design do not match</strong></p>
<p>  You still run mainly as functions or projects (BI team, central data, “the business”), but your target diagram is full of tidy domains and data products. In practice, work and budgets still route through the old structure.</p>
</li>
<li><p><strong>Domains “own” data only on paper</strong></p>
<p>  You have named data product owners, but they have no time, no engineers, and no real mandate. At quarter‑end, they still send SQL snippets or requests back to the central team to “fix the numbers.”</p>
</li>
<li><p><strong>Everyone is rebuilding the basics</strong></p>
<p>  Each domain is trying to figure out its own ingestion, testing, and monitoring. There is no clear golden path from the platform. Teams ask each other “how did you solve X?” because nothing is shared.</p>
</li>
<li><p><strong>Platform and practices are still fluid</strong></p>
<p>  Tooling, environments, and standards are changing under people’s feet. New ways of doing things keep appearing before the old ones have settled. The idea of “letting every domain choose” feels like multiplying today’s instability.</p>
</li>
<li><p><strong>Big‑bang thinking instead of small, contained pilots</strong></p>
<p>  The transformation is framed as “we are doing data mesh” across the whole estate. There is no small, end‑to‑end domain where you can point and say: “this is working, here is how, here is what we learned.”</p>
</li>
</ul>
<p>If several of these resonate, the problem is not that mesh is a bad idea. It is that you are asking the organization to run an operating model it does not yet have the structure, skills, or platform to support.</p>
<h3 id="heading-staying-centralized-when-the-lake-team-is-the-bottleneck">Staying Centralized When the Lake Team Is the Bottleneck</h3>
<p>On the other side, you may be clinging to a central lake model after it has clearly outlived its scale. Common signals:</p>
<ul>
<li><p><strong>The central backlog is the throttle for business change</strong></p>
<p>  Most meaningful analytics or data changes must pass through one team. Lead times are counted in weeks or months. Leaders talk about “waiting for the data team” as a standard part of delivery.</p>
</li>
<li><p><strong>Shadow data and parallel stacks keep popping up</strong></p>
<p>  Teams run their own extracts, spreadsheets, and side databases “just for now.” Some units experiment with their own BI tools or cloud accounts because they see no other way to move.</p>
</li>
<li><p><strong>Trust in “official” data is eroding</strong></p>
<p>  People compare dashboards from different sources in meetings. Arguments start with “that’s not what my numbers say.” The central platform is still branded as “single source of truth,” but behaviour says otherwise.</p>
</li>
<li><p><strong>The platform is seen as something done <em>to</em> domains</strong></p>
<p>  Business teams feel they have little say in schemas or priorities and experience the central team as a gate, not a partner. Even when they use the official path, it feels misaligned with their reality, so they increasingly disengage from the central solution.</p>
</li>
<li><p><strong>Central teams are fighting symptoms, not causes</strong></p>
<p>  Most effort goes into patching fragile pipelines and firefighting incidents in unfamiliar domains. Root‑cause fixes require deep business context that sits with local teams, but ownership has never really moved there.</p>
</li>
</ul>
<p>If these signals are familiar, your issue is no longer “do we need more standards?” or “do we need a better tool?”. The central lake itself is now the limit for how fast you can change. That is usually the moment to start pushing ownership and accountability closer to domains, on top of a strong shared platform, instead of adding yet another layer of process around the same bottleneck.</p>
<h2 id="heading-a-readiness-lens-for-your-organization">A Readiness Lens for Your Organization</h2>
<p>Moving from a centralized approach to a more decentralized one is not a simple on–off switch. It needs a deliberate roadmap and preparation.</p>
<p>The questions below give you four key lenses to assess whether your organization is ready for that step. If your answer is “no” for one of them, the “<strong>Action</strong>” under that lens shows where to invest next so you move closer to a decentralized model you can actually run.</p>
<h3 id="heading-ownership">Ownership</h3>
<p>Ask:</p>
<ul>
<li><p>Have you already identified your domains?</p>
</li>
<li><p>Do domains already own services, APIs, or key analytics for their area?</p>
</li>
<li><p>Can you name accountable data owners with enough time and mandate, not just a title?</p>
</li>
</ul>
<p>If ownership is diffuse and always rolls back to “the data team”, you are not ready for a broad mesh.</p>
<p><strong>Action</strong>: Start by clarifying ownership inside a central lake model.</p>
<h3 id="heading-platform-maturity">Platform Maturity</h3>
<p>Ask:</p>
<ul>
<li><p>Do you have a shared platform for cross-cutting concerns, especially security and observability?</p>
</li>
<li><p>Or would each domain have to assemble its own solutions in order to build data products?</p>
</li>
</ul>
<p>If each domain would be rebuilding basics on a moving foundation, decentralization just multiplies instability.</p>
<p><strong>Action</strong>: Stabilize the central platform and define a “golden path” before pushing ownership out.</p>
<h3 id="heading-governance-and-trust">Governance and Trust</h3>
<p>When we talk about governance here, we mean the practical rules for how data is created, changed, and used across the organization. That includes data governance topics like definitions, quality, access, and lineage, and also the decision rights around who can approve changes, who owns which datasets, and how exceptions are handled. In other words: the minimal set of shared rules and decision paths that keep data trustworthy and compliant.</p>
<p>Ask:</p>
<ul>
<li><p>Are standards followed mainly because they are useful, or because a committee enforces them?</p>
</li>
<li><p>Can you define a small, clear set of rules and trust domains to operate within them?</p>
</li>
</ul>
<p>If every change requires escalation and exception handling, added autonomy will create more variance, not more value.</p>
<p>You need governance that works as <strong>guardrails</strong>, not as a gate for every decision.</p>
<p><strong>Action</strong>: Make governance work as guardrails with pre-approved paths that teams can adopt without asking permission.</p>
<h3 id="heading-people-and-skills">People and Skills</h3>
<p>Ask:</p>
<ul>
<li><p>Do domain teams have engineers or analytics engineers who can own pipelines end to end?</p>
</li>
<li><p>Do you have an enablement team that helps domains adopt shared practices and technology — coaching them, providing templates, and pairing on real work?</p>
</li>
</ul>
<p>If these skills sit only in one central team, you will either overload it or hand responsibility to people who cannot carry it.</p>
<p><strong>Action</strong>: Invest in skills and enablement before shifting significant lifecycle ownership into domains.</p>
<h2 id="heading-its-not-a-binary-choice-borrow-the-best-ideas"><strong>It’s Not a Binary Choice: Borrow the Best Ideas</strong></h2>
<p>You do not have to pick a side and live with it forever. Most healthy organizations end up with a central platform and varying degrees of decentralized ownership on top.</p>
<h3 id="heading-apply-mesh-ideas-on-a-central-platform">Apply Mesh Ideas on a Central Platform</h3>
<p>You do not need to re‑architect everything to benefit from data mesh thinking. Many of the ideas work well on top of a central lake or warehouse you already run.</p>
<p>In practice, the pattern that works is:</p>
<ul>
<li><p><strong>Keep the platform centralized.</strong></p>
<p>  One place to manage infrastructure, security, governance, and the golden path.</p>
</li>
<li><p><strong>Decentralize data products.</strong></p>
<p>  Domains own the tables, models, and APIs that represent their part of the business, on top of that shared platform.</p>
</li>
</ul>
<p>On that foundation, apply “mesh” ideas inside your lake or warehouse:</p>
<ul>
<li><p>Treat important datasets as <strong>products</strong>: give them clear owners, roadmaps, and SLAs.</p>
</li>
<li><p>Make contracts explicit: schemas, refresh cadence, and rules for breaking changes.</p>
</li>
<li><p>Bring domain experts into design and prioritization, instead of letting a central team guess what they need.</p>
</li>
</ul>
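<p>To make the contract idea concrete, here is a minimal sketch of what an explicit contract could look like if a domain encoded it next to its data product. Every name and value below is illustrative, not a standard:</p>
<pre><code class="language-python">from dataclasses import dataclass

@dataclass
class DataContract:
    """Illustrative contract for one domain-owned data product."""
    product: str              # e.g. "orders.daily_revenue"
    owner: str                # the accountable domain team
    schema: dict              # column name to type
    refresh_cadence: str      # when consumers can expect fresh data
    freshness_slo_hours: int  # maximum acceptable staleness
    breaking_change_policy: str  # how schema changes are announced

orders_revenue = DataContract(
    product="orders.daily_revenue",
    owner="orders-domain-team",
    schema={"order_date": "date", "region": "string", "revenue": "decimal"},
    refresh_cadence="daily by 06:00 UTC",
    freshness_slo_hours=24,
    breaking_change_policy="30-day notice; previous schema kept as versioned table",
)
</code></pre>
<p>The exact medium matters less than the habit: consumers can read exactly what they can rely on, and producers know what counts as a breaking change.</p>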
<p>You get leverage from a shared foundation, while pushing accountability closer to where decisions and knowledge live.</p>
<h3 id="heading-sequence-the-journey">Sequence the Journey</h3>
<p>You do not need, and likely do not want, a big‑bang transition.</p>
<p>A pragmatic sequence:</p>
<ol>
<li><p><strong>Stabilize a central lake or warehouse.</strong></p>
<p> Get reliability, governance, and observability to an acceptable baseline.</p>
</li>
<li><p><strong>Introduce product thinking and ownership.</strong></p>
<p> Name owners for key domains and define clear contracts.</p>
</li>
<li><p><strong>Gradually decentralize where it adds leverage.</strong></p>
<p> Let mature domains take on more of the lifecycle, on top of the shared platform.</p>
</li>
</ol>
<p>At each step, ask a simple question: <strong>can your organization operate this level of decentralization without creating chaos?</strong></p>
<h2 id="heading-you-dont-have-to-make-this-choice-alone"><strong>You Don’t Have to Make This Choice Alone</strong></h2>
<p>At DataChef, we have helped organizations design and evolve data platforms across warehouses, lakes, lakehouses and meshes. The patterns that work in practice are always the ones that start from the business.</p>
<p>The shift from a centralized to a more decentralized model does not happen overnight. You need a clear view of your structure, skills, and governance, and a roadmap that links operating model and technology to outcomes. That is why we look beyond tools and architectures and focus equally on organizational design and team boundaries, using approaches like Team Topologies.</p>
<p>If you are considering a move towards data mesh or rethinking your central platform, we would be happy to help you assess where you are today and design a path you can truly own.</p>
]]></content:encoded></item><item><title><![CDATA[Beyond Dashboards: An Executive Introduction to Generative BI]]></title><description><![CDATA[For years, leaders have relied on dashboards and BI teams to understand what is happening in their organization. The rhythm has been familiar: you notice something unusual, ask someone to “pull a report,” and wait while the request moves through a ba...]]></description><link>https://blog.datachef.co/beyond-dashboards-an-executive-introduction-to-generative-bi</link><guid isPermaLink="true">https://blog.datachef.co/beyond-dashboards-an-executive-introduction-to-generative-bi</guid><category><![CDATA[#GenerativeBI ]]></category><category><![CDATA[#ConversationalAnalytics]]></category><category><![CDATA[BUSINESS INTELLIGENCE ]]></category><dc:creator><![CDATA[Bram Elfrink]]></dc:creator><pubDate>Thu, 04 Dec 2025 10:15:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764840878007/620cf556-554a-45fb-8f27-f3b407f54d2d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For years, leaders have relied on dashboards and BI teams to understand what is happening in their organization. The rhythm has been familiar: you notice something unusual, ask someone to “pull a report,” and wait while the request moves through a backlog. Days or weeks later, you receive the chart you asked for. Sometimes it helps. Sometimes the moment has already passed.</p>
<p>A quiet shift is now underway. It is changing how leaders work with data, how quickly they make decisions, and how often they validate their intuition. That shift is <strong>Generative BI</strong>.</p>
<p>Despite the futuristic name, it is not a tool. It is a capability.</p>
<h2 id="heading-from-dashboards-to-conversation"><strong>From Dashboards to Conversation</strong></h2>
<hr />
<p>The simplest way to understand Generative BI is to compare two worlds. Imagine you are reviewing last month’s performance and notice that margin dipped in one region. In the old world, you would email your BI lead, ask for a breakdown by product and channel, and wait a few days for a new report. By the time it arrives, the team has already moved on to other priorities. In the new world, you ask: <em>“Why did margin fall in Region North last month?”</em> You immediately see the answer broken down by product line. You follow up with <em>“Show me the top three customers that changed the most compared to the previous quarter”</em> and <em>“How much of this is due to discounts?”</em> You get to a decision in minutes, while the topic is still fresh in your next leadership meeting.</p>
<p>That is Generative BI: a conversational layer on top of the data and models you already have, designed to unlock faster thinking and faster decisions. This does not replace analytics. Dashboards remain essential for recurring KPIs and long‑term performance tracking. What changes is the interface. Executives and domain experts no longer need to file tickets for every ad‑hoc question. They can simply ask and explore.</p>
<h2 id="heading-why-executives-should-care"><strong>Why Executives Should Care</strong></h2>
<hr />
<p>The value of Generative BI has little to do with AI as a buzzword. Its impact is felt in the pace and quality of decision‑making.</p>
<p><strong>Faster, More Accountable Decisions</strong></p>
<p>Instead of waiting for someone to build or modify a report, leaders can get to the <em>why</em> in real time, directly in conversation with their data. This is powerful in areas where timing compounds results: pricing, campaign optimization, supply chain adjustments, store or channel operations. Being able to ask “what changed?” at the moment you notice something creates a fundamentally different decision cadence.</p>
<p>At the same time, when the path from question to answer is transparent – which data was used, which definitions were applied, which assumptions were made – it becomes easier to challenge, refine and document decisions. Generative BI does not just speed up decision‑making. It makes the link between narrative, numbers and ownership much clearer.</p>
<p><strong>A Better Return on Data Investments</strong></p>
<p>Most organizations have already invested heavily in warehouses, modeling and BI tools. Generative BI does not replace that work. It helps people finally use more of what they already have, by making it far easier to get from question to insight.</p>
<p><strong>Relief for the BI Backlog</strong></p>
<p>BI teams are excellent at producing repeatable dashboards. What they cannot cover is the long tail of everyday questions that never get prioritized, simply because nobody has time to build a view for them. Generative BI absorbs that long tail. It answers the quick, exploratory questions that today either never get asked or arrive as “just one more request” for the analytics team.</p>
<p><strong>More Curiosity Across the Organization</strong></p>
<p>When people do not need to open tickets or wait for sprints, they ask more questions. They explore more scenarios. They validate more hypotheses. Many of the best ideas in an organization start with someone noticing a small anomaly and digging deeper. Generative BI makes that investigation immediate, instead of dependent on the next reporting cycle.</p>
<h2 id="heading-the-honest-limitations"><strong>The Honest Limitations</strong></h2>
<hr />
<p>Like any powerful capability, Generative BI has real constraints. Being clear about them is the difference between a valuable system and a disappointing pilot.</p>
<p><strong>Your Data Still Needs to Be Right</strong></p>
<p>If definitions are inconsistent or tables do not line up, the system will give you answers quickly, but they may be confidently wrong. Generative BI exposes weak data foundations. It does not repair them.</p>
<p><strong>Nuanced Business Concepts Are Not Obvious to a Model</strong></p>
<p>The system can easily answer structured questions such as trends, comparisons or top‑N performers. It struggles when underlying concepts are ambiguous or live inside people’s heads. Terms like “active customer,” “at‑risk account,” or “adjusted margin” often require shared understanding across teams. Unless those definitions are made explicit, a model cannot apply them reliably.</p>
<p><strong>People Still Need to Trust the Numbers</strong></p>
<p>Executives will always ask where a number came from. A good Generative BI setup must show how an answer was produced: which tables it touched, which metrics it used and what logic it applied. Without that transparency, leaders will treat the system as a toy rather than a decision support tool.</p>
<p><strong>Governance Matters More, Not Less</strong></p>
<p>A conversational interface makes it easy for anyone to ask anything. Without the right guardrails, people may access information they should not see or draw conclusions from partial context. Thoughtful governance, access control and guardrails are what turn Generative BI from a risk into a trusted capability.</p>
<p>Put simply: Generative BI will not magically fix weak data foundations or misaligned definitions. It amplifies whatever you already have, good or bad.</p>
<h2 id="heading-making-generative-bi-work-in-the-real-world"><strong>Making Generative BI Work in the Real World</strong></h2>
<hr />
<p>The hard part of Generative BI is rarely the interface. Most demos look magical. The real challenge is making it work reliably inside the messy, nuanced reality of a business.</p>
<p>This is where DataChef focuses: turning promising generative BI demos into a durable capability on top of your real data.</p>
<p><strong>From Demo to Capability</strong></p>
<p>Real data is messy. Business rules clash. Metrics drift over time. DataChef works with both domain leaders and your data/BI teams to get a complete view: not just how the data is modeled, but how the business actually makes decisions. We use focused workshops (for example, event storming, value stream mapping, and Wardley mapping) with a multi‑disciplinary group of client stakeholders. Together, we identify the critical decisions you want to support in conversation, then shape the underlying data and definitions around those questions. The goal is not a one‑off proof of concept, but a capability that executives can return to every day.</p>
<p><strong>Designing a Safe Environment for Exploration</strong></p>
<p>We work with you to decide which domains, metrics and tables should be accessible first. Together, we make explicit which sources are in scope, which definitions are authoritative, and which caveats must always be surfaced. On top of that, we put access controls and audit in place. The result is a “safe playground” where executives and domain experts can explore without risking misinterpretation or exposure of sensitive data.</p>
<p><strong>An Iterative, Low-Risk Approach</strong></p>
<p>You do not need a big‑bang transformation. We prefer a <em>thin vertical slice (or tracer bullet)</em> approach: prove the value end‑to‑end as quickly as possible by taking a narrow slice, for example a specific sub‑domain. That slice includes everything from the data model and definitions to the conversational experience and governance. This way, you get feedback on every part of the solution early, and you can expand based on real usage and learning rather than on abstract requirements.</p>
<h2 id="heading-what-this-means-for-your-organization"><strong>What This Means for Your Organization</strong></h2>
<hr />
<p>Generative BI is not the end of dashboards, and it is not a replacement for your analytics team. It is a new capability that changes how quickly leaders can think, explore ideas and act with confidence.</p>
<p>For organizations ready to embrace it, the shift is profound. It allows decision‑makers to move at the speed of conversation, supported by data they already own.</p>
<p>If you are curious what this could look like for your company, DataChef helps organizations design that first step: a secure, trusted conversational layer that unlocks meaningful, fast insights for leaders.</p>
]]></content:encoded></item><item><title><![CDATA[How We Turned Onboarding Into Something Magical Using n8n]]></title><description><![CDATA[Bringing a new colleague into DataChef has always felt a bit like welcoming someone into our kitchen. You want the place to feel warm and ready, and you want their first steps to feel smooth. For a long time, our onboarding process did not give that ...]]></description><link>https://blog.datachef.co/how-we-turned-onboarding-into-something-magical-using-n8n</link><guid isPermaLink="true">https://blog.datachef.co/how-we-turned-onboarding-into-something-magical-using-n8n</guid><category><![CDATA[n8n]]></category><category><![CDATA[workflow]]></category><category><![CDATA[onboarding]]></category><category><![CDATA[automation]]></category><dc:creator><![CDATA[Mohsen Hasani]]></dc:creator><pubDate>Mon, 01 Dec 2025 15:32:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/4NQEvxW2_4w/upload/bd6bafd24b907b85f61744e020c8ac15.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Bringing a new colleague into <strong>DataChef</strong> has always felt a bit like welcoming someone into our kitchen. You want the place to feel warm and ready, and you want their first steps to feel smooth. For a long time, our onboarding process did not give that feeling. It was slow, manual, and full of tiny tasks that could be missed. Creating emails, preparing pages, writing messages, setting up channels, checking access. None of it was wrong, but it did not match the friendly and organized culture we wanted to show.</p>
<p>At some point we realized something simple. We spend a lot of time welcoming people, but the work behind that welcome is heavy and invisible. We wanted to keep the warm feeling while removing the heavy lifting around it. So we built our own onboarding workflow in n8n, and the difference was bigger than we expected. Today the entire onboarding process runs in seconds. It feels consistent, personal, and honestly a little magical.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764601120582/0e2329e2-66db-4fb7-992a-dbddf1445abc.webp" alt class="image--center mx-auto" /></p>
<p><strong>Screenshot: The full onboarding workflow in n8n</strong></p>
<h2 id="heading-1-a-tiny-slack-form-that-starts-everything"><strong>1. A tiny Slack form that starts everything</strong></h2>
<p>It all begins when someone from PeopleOps fills in a short Slack form. It is simple and quick, containing only the basic details like name, position, and start date. Behind this small step, the workflow wakes up and starts preparing everything in the background. It feels like ringing a small bell in the kitchen and watching everything fall into place without any stress. This one form replaces hours of careful setup and long checklists.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764600990502/19ac19f4-58f7-4fb2-9c7b-6915f255c343.webp" alt class="image--center mx-auto" /></p>
<p><strong>Screenshot: Slack shortcut trigger</strong></p>
<h2 id="heading-2-the-new-colleagues-digital-identity-appears-like-magic"><strong>2. The new colleague’s digital identity appears like magic</strong></h2>
<p>Right after the form is submitted, n8n starts building the new colleague’s digital identity. It checks for an available email address, chooses one, creates it, generates a temporary password, and connects their personal email. Before automation, this took careful typing and double checking. Now the process finishes almost instantly. Watching it happen feels like someone quietly preparing the perfect setup behind the scenes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764601024563/e763f8ed-a79a-4060-8937-00104c10b838.webp" alt class="image--center mx-auto" /></p>
<p><strong>Screenshot: Email creation section</strong></p>
<h2 id="heading-3-their-notion-space-grows-around-them"><strong>3. Their Notion space grows around them</strong></h2>
<p>Notion is our central place for documentation and shared knowledge. We wanted new colleagues to arrive and instantly feel that everything was ready for them. The workflow creates a DOP (DataChef Orientation Program) page to guide their first days and a One Team page that stores their basic information. Both pages come filled with the right layout and details. There are no empty pages or forgotten fields. Everything looks prepared with intention and care.</p>
<h2 id="heading-4-a-warm-welcome-message-lands-in-their-inbox"><strong>4. A warm welcome message lands in their inbox</strong></h2>
<p>Before they even join Slack, the new colleague receives a friendly welcome email. It includes their company email, a temporary password, and simple next steps. This is often the first moment of surprise for them. It shows that the company has already prepared a space, and all they need to do is walk in.</p>
<h2 id="heading-5-n8n-patiently-waits-for-them-to-join-slack"><strong>5. n8n patiently waits for them to join Slack</strong></h2>
<p>This part has a calm and almost playful feeling. The workflow does not rush or create channels before the person actually joins Slack. Instead, it waits and checks from time to time whether the new account has appeared through SSO. When the new colleague finally signs in, n8n continues the process instantly, almost as if saying, “Welcome. Let us continue.”</p>
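<p>For the curious, the logic behind that patience is a simple poll. This is not the actual n8n workflow, just a Python sketch of the equivalent check using the official <code>slack_sdk</code> client; the token, email, and polling interval are placeholders:</p>
<pre><code class="language-python">import time

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

client = WebClient(token="xoxb-placeholder-token")  # hypothetical bot token

def wait_for_slack_account(email: str, interval_seconds: int = 900) -> str:
    """Poll Slack until the SSO-provisioned account for this email appears."""
    while True:
        try:
            # users_lookupByEmail succeeds once the account exists in the workspace
            response = client.users_lookupByEmail(email=email)
            return response["user"]["id"]
        except SlackApiError as error:
            if error.response["error"] != "users_not_found":
                raise  # a real API problem, not just "has not joined yet"
            time.sleep(interval_seconds)  # wait calmly and check again

user_id = wait_for_slack_account("new.colleague@datachef.co")
</code></pre>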
<h2 id="heading-6-their-slack-channels-appear-fully-ready"><strong>6. Their Slack channels appear fully ready</strong></h2>
<p>Once the new colleague is active in Slack, things start to appear around them. Their DOP channel is created, a progress review channel appears, clear instructions are posted, reminders are sent, and the right people are invited. Before automation, these steps were done one by one and not always on time. Now everything is ready exactly when it should be. This gives the team time to focus on the human side of onboarding, sending greetings and welcoming the new colleague into the kitchen.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764601063826/78e72a8a-e879-4132-9639-9dfa6424b4e9.webp" alt class="image--center mx-auto" /></p>
<p><strong>Screenshot: Slack channel automation</strong></p>
<h2 id="heading-7-engineers-get-github-access-without-delay"><strong>7. Engineers get GitHub access without delay</strong></h2>
<p>For engineers, the workflow includes one more useful step. It asks them for their GitHub username directly inside Slack. They respond, and n8n updates our GitHub Terraform repo and gives them the correct access. There are no delays and no forgotten requests. It is fast and reliable.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764601074244/182de229-8404-4adf-bc52-740a17578322.webp" alt class="image--center mx-auto" /></p>
<p><strong>Screenshot: GitHub automation</strong></p>
<h2 id="heading-8-peopleops-finishes-with-a-human-touch"><strong>8. PeopleOps finishes with a human touch</strong></h2>
<p>Even with all this automation, we still keep an important human moment at the end. PeopleOps receives a review form that covers hardware, internal tools, and a few things that still need personal attention. This keeps the balance right. Automation takes care of the predictable work, and people focus on the meaningful parts of onboarding.</p>
<h2 id="heading-the-result-feels-peaceful-simple-and-very-datachef"><strong>The result feels peaceful, simple, and very DataChef!</strong></h2>
<p>What used to be a long list of steps is now a calm and clear experience. New colleagues feel supported from the first moment. The team feels more relaxed. PeopleOps has time for the warm, personal parts of onboarding instead of repeating small tasks. Onboarding used to be something we had to manage carefully. Now it is something we enjoy watching. It feels like the company itself reaches out and says, “Welcome to the kitchen 🧑‍🍳, everything is ready for you!”</p>
]]></content:encoded></item><item><title><![CDATA[Fast Flow Conf 2025: Team Topologies and the Language of Flow]]></title><description><![CDATA[Consider this paradox for a second: imagine studying grammar for a language that doesn’t exist. You know all the rules in theory, yet you can’t communicate with anyone. After attending Fast Flow Conference in London earlier this week, that metaphor c...]]></description><link>https://blog.datachef.co/fast-flow-conf-2025-team-topologies-language-of-flow</link><guid isPermaLink="true">https://blog.datachef.co/fast-flow-conf-2025-team-topologies-language-of-flow</guid><category><![CDATA[team topologies]]></category><category><![CDATA[flow]]></category><category><![CDATA[team collaboration]]></category><category><![CDATA[organization]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Fri, 17 Oct 2025 12:22:19 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/BVyNlchWqzs/upload/502fa02008d2832471fc93b01d640c8f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Consider this paradox for a second: imagine studying grammar for a language that doesn’t exist. You know all the rules in theory, yet you can’t communicate with anyone. After attending Fast Flow Conference in London earlier this week, that metaphor came to my mind for those organizations that try to introduce Team Topologies as a silver bullet without fully assessing the sociotechnical system around them.</p>
<p>Leaders with reductionist tendencies are always looking for a formula, especially in the world of technology teams. Some of them may have thought they found one in the rigid, comforting set of options in the first book published by Skelton and Pais — a grammar, indeed, to label teams and their interactions.</p>
<p>Team Topologies practitioners use those team types and interaction modes both in a descriptive way (to depict an organization’s current state) and in a prescriptive way (to design its intended state). But if those reductionist leaders had been in the room with us at Fast Flow Conference, they would have realized that such patterns were never meant to be used in isolation.</p>
<p>Over the years, the Team Topologies community has expanded its toolkit. It now includes old and new methodologies, critical to develop a much deeper understanding of the dynamics of how organizations evolve over time. I call this toolkit the <strong>Language of Flow</strong>. Team types and interaction modes constitute the grammar of that language.</p>
<p>Over two days of exciting talks, I heard success stories highlighting the importance of EventStorming, Wardley mapping, value stream mapping, and systems thinking — a set of collaborative techniques that complement each other by putting humans at the center of the stage. I saw a strong emphasis on the role of enabling teams and on the principle of sharing knowledge across silos. I attended workshops where platform engineers were given discovery tools drawn directly from the best product management practices. And I saw two key figures in the Team Topologies world, Matthew Skelton and Joao Rosa, delivering talks that broadened everyone’s perspective by zooming out from the narrow focus on engineering teams to discuss strategic topics such as budgeting, economies of scale vs economies of ‘empowerment’, compliance, and more.</p>
<p>None of that felt forced or like a hard sell, and that’s precisely because all the speakers share the same Language of Flow, a toolkit that can reshape organizations by thinking human-first. The original patterns of Team Topologies are a core part of that language, acting almost like an orchestrator, but they lose their transformative power when they are introduced in an organization that is not fluent in the Language of Flow.</p>
<p>If you thought applying the grammar rules to your teams was enough to change how your organization works, you've missed the bigger picture. You need to introduce the Language of Flow as a whole.</p>
<p>DataChef has a proven history of being an agent of change in companies such as PostNL, CarNext, Rituals, and Allseas. Reach out if you need to accelerate your journey.</p>
]]></content:encoded></item><item><title><![CDATA[Agent/Non-Agent based monitoring & Distributed tracing]]></title><description><![CDATA[Introduction
When monitoring applications and infrastructure, businesses usually choose between agent-based and non-agent based monitoring solutions. Some tools, such as Datadog or Splunk, can use agent-based approaches, while both also support non-a...]]></description><link>https://blog.datachef.co/agentnon-agent-based-monitoring-and-distributed-tracing</link><guid isPermaLink="true">https://blog.datachef.co/agentnon-agent-based-monitoring-and-distributed-tracing</guid><category><![CDATA[monitoring]]></category><category><![CDATA[distributed tracing]]></category><dc:creator><![CDATA[Farbod Ahmadian]]></dc:creator><pubDate>Wed, 20 Aug 2025 12:40:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754389379096/b4b1a08c-2606-4d9d-892e-8e65f41a75fd.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>When monitoring applications and infrastructure, businesses usually choose between agent-based and non-agent-based monitoring solutions. Tools such as Datadog and Splunk support both approaches. In this article, we’ll use Datadog as an example of agent-based monitoring and Splunk as an example of non-agent-based monitoring, to show the differences in practice. Each approach has its strengths and challenges, depending on your needs, environments, and monitoring goals.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Disclaimer:</strong> Both Datadog and Splunk support <strong>agent-based</strong> and <strong>non-agent-based</strong> monitoring approaches. The examples in this article use Datadog primarily to illustrate agent-based monitoring and Splunk to illustrate non-agent-based monitoring. This is for clarity in comparing the concepts, not to suggest that one tool is limited to only one method.</div>
</div>

<h2 id="heading-agent-based-monitoring">Agent-Based Monitoring</h2>
<p><strong>Agent-based monitoring</strong> means installing a small program (an <em>agent</em>) on each server or host that you want to monitor.</p>
<h3 id="heading-how-it-works">How it works</h3>
<ul>
<li><p>An agent runs continuously on the host machine.</p>
</li>
<li><p>It collects metrics like CPU, memory, disk usage, application logs, and more.</p>
</li>
<li><p>Data is sent to a central server (like DataDog's cloud) for analysis.</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> datadog <span class="hljs-keyword">import</span> initialize, statsd

<span class="hljs-comment"># Initialize (if sending directly to local agent, defaults are ok)</span>
options = {<span class="hljs-string">"statsd_host"</span>:<span class="hljs-string">"localhost"</span>, <span class="hljs-string">"statsd_port"</span>:<span class="hljs-number">8125</span>}
initialize(**options)

<span class="hljs-comment"># Send a gauge metric (e.g., app.queue.size=5)</span>
statsd.gauge(<span class="hljs-string">"app.queue.size"</span>, <span class="hljs-number">5</span>)
</code></pre>
<h3 id="heading-pros">Pros</h3>
<ul>
<li><p><strong>Rich Data Collection:</strong> Agents can collect detailed performance metrics and logs directly from the machine or container.</p>
</li>
<li><p><strong>Real-Time Monitoring:</strong> Since agents run locally, they can send data almost instantly.</p>
</li>
<li><p><strong>Custom Checks:</strong> You can configure agents to do extra checks, like custom scripts.</p>
</li>
<li><p><strong>Easy Auto-Discovery:</strong> Agents can sometimes auto-detect services and start monitoring them automatically.</p>
</li>
</ul>
<h3 id="heading-cons">Cons</h3>
<ul>
<li><p><strong>Deployment and Maintenance:</strong> Every machine or container needs an agent installed and kept up to date.</p>
</li>
<li><p><strong>Resource Usage:</strong> Agents use a bit of the host’s CPU and memory, though this is usually small.</p>
</li>
<li><p><strong>Compatibility:</strong> Some environments (like highly restricted or legacy systems) may not allow agent installation.</p>
</li>
</ul>
<h2 id="heading-non-agent-based-monitoring">Non-Agent Based Monitoring</h2>
<p><strong>Non-agent based monitoring</strong> collects data without installing anything on the host. A central system pulls in data, often by receiving logs or metrics through APIs, syslog, or other protocols.</p>
<h3 id="heading-how-it-works-1">How it works</h3>
<ul>
<li><p>Systems send their log files, events, or performance data to a central collector (like Splunk).</p>
</li>
<li><p>No agents run on the host; configuration often happens on log shippers or using built-in system protocols.</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">import</span> json

splunk_url = <span class="hljs-string">"https://splunk-server:8088/services/collector"</span>
headers = {
    <span class="hljs-string">"Authorization"</span>: <span class="hljs-string">"Splunk YOUR_HEC_TOKEN"</span>
}
data = {
    <span class="hljs-string">"event"</span>: <span class="hljs-string">"metric"</span>,  <span class="hljs-comment"># event type</span>
    <span class="hljs-string">"fields"</span>: {
        <span class="hljs-string">"metric_name:app.queue.size"</span>: <span class="hljs-number">5</span>
    }
}

<span class="hljs-comment"># Send the data to Splunk</span>
requests.post(splunk_url, headers=headers, data=json.dumps(data), verify=<span class="hljs-literal">False</span>)
</code></pre>
<h3 id="heading-pros-1">Pros</h3>
<ul>
<li><p><strong>No Agent Management:</strong> Nothing to install or update on the monitored systems.</p>
</li>
<li><p><strong>Good for Legacy/Restricted Systems:</strong> Useful where you cannot install extra software.</p>
</li>
<li><p><strong>Centralized Control:</strong> All settings and updates occur on the collector’s side.</p>
</li>
</ul>
<h3 id="heading-cons-1">Cons</h3>
<ul>
<li><p><strong>Limited Data:</strong> Sometimes, only basic metrics or logs are available unless the system supports rich exports.</p>
</li>
<li><p><strong>Slower or Batch Updates:</strong> Data may arrive in batches, so insights may lag behind real-time.</p>
</li>
<li><p><strong>Harder Customization:</strong> Custom health checks or metrics are harder to set up.</p>
</li>
</ul>
<h2 id="heading-distributed-tracing"><strong>Distributed Tracing</strong></h2>
<h3 id="heading-what-is-distributed-tracing">What is distributed tracing?</h3>
<p><strong>Distributed tracing</strong> helps developers follow the journey of a user or API request as it moves through different services in a system. Each step the request takes is called a “span.” Each span gets a <em>span ID</em> (unique identifier).</p>
<h3 id="heading-where-is-it-used">Where is it used?</h3>
<ul>
<li><p><strong>Microservices:</strong> When applications have many small services talking to each other.</p>
</li>
<li><p><strong>Serverless and Cloud-Native Apps:</strong> Where requests touch multiple services.</p>
</li>
<li><p><strong>Debugging Performance Issues:</strong> To find bottlenecks in big, complex systems.</p>
</li>
</ul>
<h3 id="heading-how-is-it-used">How is it used?</h3>
<ul>
<li><p>When a request starts, a trace ID and a span ID are created.</p>
</li>
<li><p>As the request travels, new spans and IDs are made for each new service or step.</p>
</li>
<li><p>All the spans are grouped together under the single trace ID.</p>
</li>
<li><p>This lets you see the entire path—the trace—of a request, how long each step takes, and where failures happen.</p>
</li>
</ul>
<h3 id="heading-span-id">Span ID</h3>
<ul>
<li><p>Every operation in the trace has its own <em>span ID</em>.</p>
</li>
<li><p>The span ID helps to track and organize all steps within a single trace.</p>
</li>
<li><p>By analyzing span IDs, you can see how long each segment took and how services relate to each other.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754389681548/cb428bd5-6a96-4ec7-b9cf-5eb0ba55df3c.jpeg" alt class="image--center mx-auto" /></p>
<h2 id="heading-when-to-use-each-monitoring-solution">When to use each monitoring solution</h2>
<ul>
<li><p><strong>Visualization:</strong> DataDog has easy-to-use dashboards to view full traces and performance of each span.</p>
</li>
<li><p><strong>Friendly for Developers:</strong> Many programming languages and frameworks are supported with minimal manual work.</p>
</li>
<li><p><strong>Built-In Agent Support:</strong> DataDog’s agent natively collects distributed traces together with logs and metrics, which makes setup simpler and the resulting integration deeper.</p>
</li>
<li><p><strong>Real-Time Tracing:</strong> The agent sends tracing data in real time, which helps with quick debugging.</p>
</li>
<li><p><strong>Automatic Context Linking:</strong> Traces, logs, and metrics are tied together, making it easier to investigate issues.</p>
</li>
<li><p><strong>More Control</strong>: If you want more control over the OpenTelemetry abstraction, Splunk can be a better option, although the <a target="_blank" href="https://github.com/signalfx/splunk-otel-python#readme">OpenTelemetry library</a> is still under development.</p>
</li>
</ul>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">Whether agent-based or non-agent-based monitoring is a better fit depends on your environment, security requirements, and operational trade-offs.</div>
</div>

<p>By contrast, while Splunk can handle traces through external plugins or integrations, it often requires more manual setup and may not offer real-time or tightly integrated tracing experiences out-of-the-box.</p>
<p>Simple Python example of how distributed tracing can be done in DataDog:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> ddtrace <span class="hljs-keyword">import</span> tracer, patch_all

patch_all()

<span class="hljs-meta">@tracer.wrap()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">say_hello</span>():</span>
    print(<span class="hljs-string">"Hello from DataDog tracing!"</span>)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    say_hello()
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754392182257/2376096d-f182-45c7-a18e-d5dd5a129711.png" alt class="image--center mx-auto" /></p>
<p>Same example with Splunk:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> opentelemetry <span class="hljs-keyword">import</span> trace
<span class="hljs-keyword">from</span> opentelemetry.sdk.trace <span class="hljs-keyword">import</span> TracerProvider
<span class="hljs-keyword">from</span> opentelemetry.sdk.trace.export <span class="hljs-keyword">import</span> BatchSpanProcessor
<span class="hljs-keyword">from</span> opentelemetry.exporter.otlp.proto.http.trace_exporter <span class="hljs-keyword">import</span> OTLPSpanExporter

otlp_exporter = OTLPSpanExporter(endpoint=<span class="hljs-string">"&lt;https://ingest.us1.signalfx.com/v2/trace&gt;"</span>, headers={
    <span class="hljs-string">"X-SF-TOKEN"</span>: <span class="hljs-string">"your-splunk-access-token"</span>
})

trace.set_tracer_provider(TracerProvider())
span_processor = BatchSpanProcessor(otlp_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

tracer = trace.get_tracer(__name__)

<span class="hljs-keyword">with</span> tracer.start_as_current_span(<span class="hljs-string">"splunk-example-span"</span>):
    print(<span class="hljs-string">"Hello from Splunk tracing!"</span>)
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<ul>
<li><p>Both Datadog and Splunk offer agent-based and agentless options. The choice is less about the tool and more about your monitoring strategy: agent-based for richer, real-time detail, and non-agent for environments where agents aren’t possible.</p>
</li>
<li><p>For distributed tracing, Datadog offers strong out-of-the-box support, while Splunk emphasizes flexibility and standards like OpenTelemetry.</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Why your LLM needs its own version of Monitoring]]></title><description><![CDATA[TL;DR: LLMs don’t behave like normal services. They’re creative, a bit moody, and change under the hood without warning. Left alone, they surprise you at the worst time. You need LLMOps Monitoring: evals to measure quality, tracing to debug runs, and...]]></description><link>https://blog.datachef.co/why-llm-monitoring-llmops</link><guid isPermaLink="true">https://blog.datachef.co/why-llm-monitoring-llmops</guid><category><![CDATA[llm]]></category><category><![CDATA[#llmops]]></category><category><![CDATA[monitoring]]></category><dc:creator><![CDATA[Ali Yazdizadeh]]></dc:creator><pubDate>Fri, 15 Aug 2025 12:42:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755261654052/4bac505e-f76d-405b-a9f1-a616654784a8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>TL;DR:</strong> LLMs don’t behave like normal services. They’re creative, a bit moody, and change under the hood without warning. Left alone, they surprise you at the worst time. You need <strong>LLMOps Monitoring</strong>: evals to measure quality, tracing to debug runs, and clear views on cost and latency. Do this and you stop learning about problems from angry users.</p>
<p>A web API is (mostly) predictable. An LLM is not. Tiny changes in prompt, model version, or context produce different outcomes. Vendors swap model weights. Your knowledge base drifts. Here are some examples of how things can go wrong with LLMs in spectacular ways!</p>
<p><strong>“The Friday Rollout That Looked Fine in Staging”  
</strong>You upgrade a prompt and bump the temperature from 0.2 → 0.4 to make your assistant feel friendlier. Unit tests still pass. By Monday, support volume is up 40% because:</p>
<ul>
<li><p>The assistant started adding confident-but-wrong “extra context.”</p>
</li>
<li><p>Token counts rose ~25% due to chattier answers.</p>
</li>
<li><p>Latency crossed your SLO during peak hours.</p>
</li>
</ul>
<p>Here’s another classic:</p>
<p><strong>“Temporally Confused Bot”</strong></p>
<p>Your chatbot answers tax questions. A user asks: “What’s the current VAT rate in the UK?” The model reads an old PDF from 2019 and replies with a past rate. The user posts the wrong answer on social. Support wakes you up. Fun times.</p>
<p>But how do you avoid the “user report → war room → roll back” drama?</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755261423288/ada530b7-8ac7-409a-a2b4-8f7b183639f3.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-evals-measure-quality-in-plain-numbers"><strong>Evals: measure quality in plain numbers</strong></h2>
<p>Evals turn “feels right” into metrics you can track. Do both <strong>offline</strong> (pre-deployment) and <strong>online</strong> (shadow tests or sampled live traffic).</p>
<p><strong>Factual Accuracy  
</strong>Test if answers match your provided docs. Build small Q&amp;A sets per collection and score with a simple rubric: “fully supported,” “partially,” “not supported.” Log which passages the model cites.</p>
<p><strong>Temporal Accuracy  
</strong>Test if the model gives the <em>current</em> truth. Tag questions as time-sensitive. Keep a small table of ground truth with dates (e.g., “VAT = 20% as of 2025-01-01”). If the answer is old or hedges, it fails.</p>
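<p>A minimal sketch of such a check, with a placeholder ground-truth table rather than real tax data, can be a few lines of Python:</p>
<pre><code class="language-python">from datetime import date

# Placeholder ground truth: fact maps to (current value, effective date)
GROUND_TRUTH = {
    "uk_vat_rate": ("20%", date(2011, 1, 4)),
}

def temporal_eval(fact: str, answer: str) -> bool:
    """Pass only if the answer states the currently effective value."""
    current_value, _effective_since = GROUND_TRUTH[fact]
    return current_value in answer

assert temporal_eval("uk_vat_rate", "The current UK VAT rate is 20%.")
assert not temporal_eval("uk_vat_rate", "The UK VAT rate is 17.5%.")
</code></pre>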
<p>Here are some other evals that can be useful:</p>
<ul>
<li><p><strong>Format Validity:</strong> Is the JSON valid? Does it match your schema? (See the sketch after this list.)</p>
</li>
<li><p><strong>Safety &amp; PII</strong>: Refuses risky requests; redacts emails/IDs when required.</p>
</li>
<li><p><strong>RAG Faithfulness:</strong> Is the answer supported by retrieved text? Penalize made-up facts.</p>
</li>
</ul>
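<p>As an example, the format-validity check can be done with the standard library alone; the required keys below are an assumed output schema, not a fixed standard:</p>
<pre><code class="language-python">import json

REQUIRED_KEYS = {"answer", "sources", "confidence"}  # assumed output schema

def format_valid(raw_output: str) -> bool:
    """Check that the model returned parseable JSON with the expected keys."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and REQUIRED_KEYS.issubset(parsed)

assert format_valid('{"answer": "42", "sources": [], "confidence": 0.9}')
assert not format_valid("The answer is 42.")
</code></pre>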
<h2 id="heading-tracing"><strong>Tracing:</strong></h2>
<ul>
<li><h2 id="heading-make-every-run-debuggable-logging"><strong>Make every run debuggable (Logging)</strong></h2>
</li>
</ul>
<p>When something goes odd, you need a clear trail from input to output. This goes beyond classic logging, since LLM runs add new requirements, such as recording which model configuration was used.</p>
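<p>Before the full checklist, here is a rough sketch of what one request’s record could look like as a plain Python dict. Every field name and value is illustrative, not taken from any particular tool:</p>
<pre><code class="language-python">trace_record = {
    "prompt_version": "support-answer-v12",  # which prompt template ran
    "model": {"name": "example-model", "temperature": 0.2, "max_tokens": 512},
    "retrieval": {"query": "uk vat rate", "top_k": 5, "chunk_ids": ["kb-041"]},
    "tool_calls": [{"name": "tax_lookup", "args": {"country": "UK"}, "error": None}],
    "guardrails": {"pii_check": "passed"},
    "tokens": {"prompt": 1420, "completion": 210},
    "latency_ms": {"retrieval": 80, "model": 950, "total": 1100},
    "cache_hit": False,
}
</code></pre>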
<p>Here is a suggested list of what to record on every request:</p>
<ul>
<li><p>Prompt version and all variables filled in</p>
</li>
<li><p>Model name, temperature, max tokens</p>
</li>
<li><p>Retrieval query, top-K, and the <strong>actual</strong> chunks shown to the model</p>
</li>
<li><p>Every tool call: name, args, result, errors</p>
</li>
<li><p>Guardrail checks and why they passed/failed</p>
</li>
<li><p>Token counts (prompt, completion)</p>
</li>
<li><p>Latency per step and end-to-end</p>
</li>
<li><p>Caching, retries, and fallbacks taken</p>
</li>
<li><h2 id="heading-no-surprises-on-the-bill-cost-monitoring"><strong>No surprises on the bill (Cost Monitoring)</strong></h2>
</li>
</ul>
<p>LLM cost is mostly tokens and tool calls. It can spike fast.</p>
<p>Track spend by <strong>model, endpoint, feature, team, and user</strong>. Watch:</p>
<ul>
<li><p>Tokens per request, and per successful task</p>
</li>
<li><p>Cost per 1K requests and per solved ticket</p>
</li>
<li><p>Tool call density (some chains spam tools)</p>
</li>
<li><p>Cache hit rate (misses are expensive)</p>
</li>
</ul>
<p>Add budget alerts and soft limits. Route simple tasks to a smaller model. Use a “tiny prefilter → big model on hard cases” path. Compress context, prune long histories, and store reusable summaries.</p>
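<p>The arithmetic itself is trivial once you log token counts per request; the per-1K prices below are placeholders for your provider’s actual rates:</p>
<pre><code class="language-python"># Placeholder prices per 1K tokens; substitute your provider's actual rates
PRICE_PER_1K = {"prompt": 0.0025, "completion": 0.0100}

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one request, from the token counts in its trace."""
    return (prompt_tokens / 1000) * PRICE_PER_1K["prompt"] + (
        completion_tokens / 1000
    ) * PRICE_PER_1K["completion"]

# 1,420 prompt tokens and 210 completion tokens cost about $0.0057 here
print(round(request_cost(1420, 210), 4))
</code></pre>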
<h2 id="heading-fast-models-feel-smarter-latency"><strong>Fast models feel smarter (Latency)</strong></h2>
<p>Speed is part of quality. Users judge the answer and the wait.</p>
<p>Measure latency at each hop:</p>
<ul>
<li><p>Retrieval time</p>
</li>
<li><p>Model time</p>
</li>
<li><p>Tool time (each call)</p>
</li>
<li><p>End-to-end time, with p50/p95/p99</p>
</li>
</ul>
<p>Tune by caching hot results, prefetching facts, lowering top-K, streaming tokens to the UI, and moving slow tools off the critical path. Set alerts on p95/p99, not just averages.</p>
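<p>Percentiles are cheap to compute from the latencies you already log. A standard-library sketch, with made-up sample data:</p>
<pre><code class="language-python">from statistics import quantiles

# Sample end-to-end latencies in milliseconds, as pulled from your traces
latencies_ms = [120, 135, 140, 150, 160, 165, 170, 180, 900, 2500]

def percentile(values, p):
    """p-th percentile via 100 quantile cut points (p between 1 and 99)."""
    return quantiles(values, n=100)[p - 1]

# The average hides the slow tail; p95 and p99 expose it
print("p50:", percentile(latencies_ms, 50))
print("p95:", percentile(latencies_ms, 95))
print("p99:", percentile(latencies_ms, 99))
</code></pre>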
<h2 id="heading-tooling-options"><strong>Tooling Options:</strong></h2>
<p>You can build from scratch, but most teams mix in ready tools for LLMOps. Here are some popular picks:</p>
<table><tbody><tr><td><p><strong>Tool</strong></p></td><td><p><strong>OSS</strong></p></td><td><p><strong>Focus</strong></p></td><td><p><strong>Hosting</strong></p></td><td><p><strong>Why teams pick it</strong></p></td></tr><tr><td><p><strong>Langfuse</strong></p></td><td><p><strong>Yes</strong></p></td><td><p><strong>Eval + Tracing</strong></p></td><td><p>Self-host &amp; Cloud</p></td><td><p>Popular, sharp trace UI, good SDKs, easy prompt/version tracking.</p></td></tr><tr><td><p><strong>Helicone</strong></p></td><td><p><strong>Yes</strong></p></td><td><p><strong>Eval + Tracing</strong></p></td><td><p>Self-host &amp; Cloud</p></td><td><p>Proxy-style drop-in; strong cost/latency views across providers.</p></td></tr><tr><td><p><strong>Humanloop</strong></p></td><td><p><strong>Yes</strong></p></td><td><p><strong>Eval-centric</strong></p></td><td><p>Self-host &amp; Cloud</p></td><td><p>Great for dataset curation, rubric design, human review loops.</p></td></tr><tr><td><p><strong>LangSmith</strong></p></td><td><p>Not fully OSS</p></td><td><p><strong>Eval + Tracing</strong></p></td><td><p>Cloud (plus enterprise options)</p></td><td><p>Deep integration with LangChain pipelines and tools.</p></td></tr></tbody></table>

<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755261245921/eb3516e1-f814-4aa0-97e9-c79c55291ae0.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>Tip: Most teams start with <strong>Langfuse</strong> (or <strong>Helicone</strong>) for traces + basic evals, add <strong>Humanloop</strong> for richer human-in-the-loop workflows, and integrate <strong>LangSmith</strong> if they’re already heavy on LangChain.</p>
</blockquote>
<h2 id="heading-wrap-up"><strong>Wrap-up</strong></h2>
<p>LLMs are powerful, but they wander. LLMOps Monitoring keeps them in bounds. With evals, tracing, and clear cost/latency views, you find issues before your users do. Start small, measure what matters, and keep your model on a short, friendly leash.</p>
]]></content:encoded></item><item><title><![CDATA[Designing the "Blank Slate" of an organization]]></title><description><![CDATA[The first time you open an application you just installed, you might see several frames and components that look empty. These will fill up with your data as you start using the application. Designers call this initial state a "Blank Slate."
It's now ...]]></description><link>https://blog.datachef.co/designing-the-blank-slate-organization</link><guid isPermaLink="true">https://blog.datachef.co/designing-the-blank-slate-organization</guid><category><![CDATA[organization]]></category><category><![CDATA[leadership]]></category><category><![CDATA[management]]></category><category><![CDATA[transformation]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Mon, 14 Jul 2025 14:45:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/x9WZ3aMFKME/upload/4a1f4e09c447b259e071d555f8ee4ed0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The first time you open an application you just installed, you might see several frames and components that look empty. These will fill up with your data as you start using the application. Designers call this initial state a "Blank Slate."</p>
<p>It's now widely recognized that neglecting to design the blank slate is a big mistake because it's the user's <strong>first impression</strong> of your product. You might have the best user interface when the app is filled with data, but users may never reach that stage if it's not clear how to interact with the application when it's empty.</p>
<p>This is a fascinating concept, and as I was reflecting on how organizations evolve, I thought that many business leaders could take a page from the designer’s book about the importance of “blank slates” in their plans. Sure, when you step into a new role, tasked with transforming a department, it’s crucial to have a vision for how teams will interact in the future, which roles to introduce, which profiles to hire, and who will be responsible for what. However, a leader should not bypass the design of the “blank slate”, as in, the snapshot of what the organization looks like at the beginning of this transformation.</p>
<p>Too often, leaders can’t resist the temptation of dictating their vision from an ivory tower, while hiring a layer of smart middle-managers and delegating the execution of the plan to them. But <strong>even the most brilliant hires will get lost in the mess</strong> if they are shown a picture that only exists in the head of their enlightened boss. Imagine welcoming a new engineer into a maze of legacy services nobody remembers how to sunset. She receives a visionary slide describing a “data mesh” utopia, yet her onboarding checklist is full of monoliths that refuse to die.</p>
<p>In the gulf between present reality and promised future, talent grows frustrated, momentum stalls, and <strong>politics</strong> fill the gaps. Legacy services keep getting more and more consumers without a clear horizon for deprecation. Teams and people who struggle to see themselves in the future picture fight to stay relevant, trying to expand their area of influence until someone with a mandate tells them to stop.</p>
<p><strong>Designing that organizational blank slate means scripting the first moves, not just day‑three‑hundred outcomes.</strong> Leaders should map out temporary but explicit ownership zones, identify what must be kept running at all costs, and label the placeholders that are expected to disappear. Much like the “Add your first project” card in an empty dashboard, these markers tell every newcomer, “Here’s where to start making an impact, and here’s when this chunk of work should gracefully sunset.”</p>
]]></content:encoded></item><item><title><![CDATA[Pandas vs. PySpark: When Bigger Isn’t Always Better]]></title><description><![CDATA[Introduction
In the data engineering world, bigger usually means better. More power, more scalability, and more buzzwords! PySpark, for instance, is typically our trusty steed when tackling massive da]]></description><link>https://blog.datachef.co/pandas-vs-pyspark-when-bigger-isnt-always-better</link><guid isPermaLink="true">https://blog.datachef.co/pandas-vs-pyspark-when-bigger-isnt-always-better</guid><category><![CDATA[spark]]></category><category><![CDATA[data-engineering]]></category><category><![CDATA[pandas]]></category><dc:creator><![CDATA[Soheil Sheybani]]></dc:creator><pubDate>Wed, 26 Mar 2025 15:23:11 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1742904506527/2b19daa5-e500-453c-b343-14fbaed0109a.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Introduction</h2>
<p>In the data engineering world, bigger usually means better. More power, more scalability, and more buzzwords! PySpark, for instance, is typically our trusty steed when tackling massive datasets on distributed clusters like Databricks. But what if, in our race to scale everything to infinity, we've overlooked simpler, faster solutions?</p>
<p>Here's a story where simplicity won big—and PySpark learned a lesson in humility.</p>
<hr />
<h2>The Great Excel Report Battle</h2>
<p>I was tasked with creating an Excel report based on a moderately sized dataset—around 100,000 rows. My initial weapon of choice? PySpark DataFrames running in Databricks. After all, PySpark thrives on handling large datasets, and our environment was already set up. Easy choice, right?</p>
<p>Well, not exactly.</p>
<p>The report involved filtering data by certificate types, applying various transformations and sorting criteria, and finally writing each filtered set to separate Excel sheets (132 sheets in total). Here's a glimpse of the PySpark code I confidently started with:</p>
<pre><code class="language-python">def write_dataframe_to_excel(writer, df, certificate):
    df = df.withColumn(
        "Status Priority",
        F.when(F.col("Status") == "Non-Comply", 0)
        .when(F.col("Status") == "Warning", 1)
        .when(F.col("Status") == "In-Future", 2)
        .otherwise(3)
    ).withColumn(
        "Sorted Expiry Date",
        F.coalesce(
            F.to_date(F.col("Certificate Expiry Date"), "dd/MM/yyyy"),
            F.to_date(F.lit("01/01/1900"), "dd/MM/yyyy")
        )
    ).orderBy("Status Priority", "Sorted Expiry Date")
    df = df.drop("certificate_code", "certificate_name", "Compliance Status", "Status Priority", "Sorted Expiry Date")
    df.toPandas().to_excel(writer, index=False, sheet_name=certificate, startrow=2)

with pd.ExcelWriter(excel_path, engine="openpyxl") as writer:
    for certificate in main_certificate_codes:
        df = compliance_df.filter(F.col("certificate_code") == certificate)
        if certificate in ["BOSIET", "HUET"]:
            df = add_caebs_columns(df)
        df = df.filter(F.col("Compliance Status").isin(["Non-Comply", "Warning", "In-Future"]))
        write_dataframe_to_excel(writer, df, certificate)
</code></pre>
<p>Initially, it looked perfect. But execution took nearly 30 minutes! How was PySpark, the champion of big data, struggling with a modest Excel report?</p>
<h2>Enter Pandas: The David to PySpark’s Goliath</h2>
<p>I started suspecting that I might be over-engineering my solution. After all, was distributed computing even necessary for a dataset of just 100,000 rows?</p>
<p>So, I decided to rewrite the logic using Pandas DataFrames, bypassing PySpark’s overhead entirely. The results were nothing short of astonishing:</p>
<pre><code class="language-python">def write_pd_dataframe_to_excel(writer, df, certificate):
    df["Status Priority"] = df["Status"].map({
        "Non-Comply": 0,
        "Warning": 1,
        "In-Future": 2
    }).fillna(3)
    df["Sorted Expiry Date"] = pd.to_datetime(
        df["Certificate Expiry Date"], 
        format="%d/%m/%Y", 
        errors="coerce"
    ).fillna(pd.to_datetime("1900-01-01"))
    df = df.sort_values(by=["Status Priority", "Sorted Expiry Date"])
    df.drop(columns=[
        "certificate_code", "certificate_name", "Compliance Status", "Status Priority", "Sorted Expiry Date"
    ], inplace=True)
    df.to_excel(writer, index=False, sheet_name=certificate, startrow=2)

compliance_df = add_caebs_columns(compliance_df)
compliance_df = compliance_df.filter(
    F.col("Compliance Status").isin(["Non-Comply", "Warning", "In-Future"])
)
pd_df = compliance_df.toPandas()
pd_df["PIN"] = pd.to_numeric(pd_df["PIN"], errors="coerce")
with pd.ExcelWriter(excel_path, engine="openpyxl") as writer:
    for certificate in main_certificate_codes:
        df = pd_df[pd_df["certificate_code"] == certificate]
        if certificate not in ["BOSIET", "HUET"]:
            df = df.drop(columns=["CAEBS Issue Date", "CAEBS Expiry Date"])
        write_pd_dataframe_to_excel(writer, df, certificate)
</code></pre>
<p>This Pandas solution reduced execution time from 30 minutes to just 2 minutes! No distributed cluster, no overhead, just straightforward Python and Pandas magic.</p>
<hr />
<h2>Why Pandas Crushed PySpark in This Case</h2>
<p>Why did Pandas outperform PySpark by such a massive margin? The key lies in the nature of the data and the overhead associated with distributed computing.</p>
<ul>
<li><p><strong>Overhead of Distributed Processing</strong>: PySpark is designed for distributed computing across multiple nodes, involving tasks like data serialization/deserialization, task distribution, and network communication. For smaller datasets like ours, this overhead becomes a significant burden, drastically affecting performance (see the timing sketch after this list).</p>
</li>
<li><p><strong>Absence of JVM Overhead</strong>: PySpark's engine runs on the Java Virtual Machine (JVM), so data and commands must cross the Python-JVM boundary, which adds serialization overhead and complexity. Pandas stays entirely in the Python process and avoids this round trip.</p>
</li>
<li><p><strong>In-Memory Operations</strong>: Pandas operates entirely within memory on a single node, making computations significantly faster.</p>
</li>
<li><p><strong>Data Locality</strong>: With all the data on a single machine, Pandas avoids the shuffling and network latency that come with PySpark's distributed design, which speeds up processing significantly.</p>
</li>
</ul>
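<p>To make the overhead point concrete, here is a minimal, hypothetical timing sketch (the data, column names, and local Spark session are illustrative assumptions, and absolute numbers will vary with your environment):</p>
<pre><code class="language-python">import time

import pandas as pd
from pyspark.sql import SparkSession

# Hypothetical 100,000-row dataset, roughly the size from the story above.
pdf = pd.DataFrame({
    "certificate_code": ["BOSIET"] * 50_000 + ["HUET"] * 50_000,
    "value": range(100_000),
})

t0 = time.perf_counter()
pdf.groupby("certificate_code")["value"].sum()
print(f"Pandas:  {time.perf_counter() - t0:.3f}s")

spark = SparkSession.builder.master("local[*]").getOrCreate()
sdf = spark.createDataFrame(pdf)

t0 = time.perf_counter()
sdf.groupBy("certificate_code").sum("value").collect()  # collect() forces execution
print(f"PySpark: {time.perf_counter() - t0:.3f}s")
</code></pre>
<p>Even on a powerful machine, the Spark run pays for session startup, query planning, and serialization before it touches the data; for a few hundred thousand rows, that fixed cost dominates.</p>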
<hr />
<h2>Has Pandas Been Abused In Data Engineering?</h2>
<p>Well, no! However, while Pandas won by a wide margin in this scenario, it isn't always the optimal solution. Pandas has some breaking points that might make you reach for other players like Polars or PySpark:</p>
<ul>
<li><strong>Memory Constraints</strong>:  Pandas loads all data into RAM. If your dataset size exceeds available memory, Pandas can break or severely slow down performance.</li>
<li><strong>Complex Joins on Large Data</strong>: Performing complex joins or operations on large datasets might exceed Pandas' capabilities, causing performance degradation.</li>
<li><strong>Parallel and Distributed Processing</strong>: Pandas is single-threaded, and if parallel or distributed computation is necessary (like ETL on massive datasets), it doesn't scale well.</li>
<li><strong>Large-scale I/O Operations</strong>: Pandas struggles with massive read/write operations, particularly from distributed sources or formats like Parquet, Avro, or ORC.</li>
<li><strong>Real-time Data Processing</strong>: Pandas is not optimized for streaming or real-time processing, where PySpark excels.</li>
<li><strong>Columnar Optimization</strong>: Unlike Polars or PySpark, Pandas doesn't have built-in support for optimized columnar operations, limiting speed in certain operations. If you have lots of columnar operations on a large dataset, Pandas is not a good choice.</li>
</ul>
<hr />
<h2>Lessons Learned: Choose Your Tools Wisely</h2>
<p>This isn’t a "Pandas always beats PySpark on small datasets" scenario—it’s about picking the right tool for the job. Before going for a tool, you need to consider these factors holistically:</p>
<ul>
<li><strong>Data Volume</strong>: Does your dataset comfortably fit in RAM?</li>
<li><strong>Computation Type</strong>: Are you performing complex transformations or simple operations?</li>
<li><strong>Latency Requirements</strong>: Is real-time or near-real-time performance critical?</li>
<li><strong>Infrastructure Available</strong>: Do you have access to distributed computing resources?</li>
<li><strong>Complexity of Joins and Aggregations</strong>: Are complex operations frequent?</li>
</ul>
<p>Use Pandas for quick, ad-hoc tasks on small-to-medium datasets. Opt for Polars for medium-large datasets on a single powerful machine. Choose PySpark for distributed computing, large datasets, or real-time analytics.</p>
<p>Below is a quick comparison table:</p>
<table>
<thead>
<tr>
<th>Feature</th>
<th>Pandas</th>
<th>PySpark</th>
<th>Polars</th>
</tr>
</thead>
<tbody><tr>
<td>Memory Efficiency</td>
<td>Moderate</td>
<td>High</td>
<td>Very High</td>
</tr>
<tr>
<td>Parallelism</td>
<td>No</td>
<td>Distributed</td>
<td>Multi-threaded</td>
</tr>
<tr>
<td>Dataset Size Handling</td>
<td>Small-Medium</td>
<td>Large-Very Large</td>
<td>Medium-Large</td>
</tr>
<tr>
<td>I/O Performance</td>
<td>Moderate</td>
<td>High</td>
<td>High</td>
</tr>
<tr>
<td>Real-Time Processing</td>
<td>Poor</td>
<td>Excellent</td>
<td>Moderate</td>
</tr>
<tr>
<td>Ease of Use</td>
<td>Excellent</td>
<td>Moderate</td>
<td>Good</td>
</tr>
<tr>
<td>Columnar Optimization</td>
<td>Limited</td>
<td>Good</td>
<td>Excellent</td>
</tr>
</tbody></table>
<p>So, next time you're faced with a data engineering task, pause before automatically reaching for the biggest tool in your toolkit. Sometimes, the simplest approach is not just sufficient—it's superior!</p>
]]></content:encoded></item><item><title><![CDATA[Data Set or Data Product, That is the Question]]></title><description><![CDATA[Data is one of the most valuable assets an organization has, yet it is often treated as an IT byproduct rather than a strategic asset. Many companies collect a vast amount of data, store it in isolated systems, and expect that insights will somehow e...]]></description><link>https://blog.datachef.co/data-set-data-product-management</link><guid isPermaLink="true">https://blog.datachef.co/data-set-data-product-management</guid><category><![CDATA[data management]]></category><category><![CDATA[Data Mesh]]></category><category><![CDATA[data]]></category><category><![CDATA[Data Products]]></category><category><![CDATA[analytics]]></category><category><![CDATA[dataset]]></category><dc:creator><![CDATA[Davide Rovati]]></dc:creator><pubDate>Thu, 13 Feb 2025 09:08:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/hpjSkU2UYSU/upload/de37295bb9824cb72fe5fd6ced981616.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Data is one of the most valuable assets an organization has, yet it is often treated as an <strong>IT byproduct</strong> rather than a <strong>strategic asset</strong>. Many companies collect a vast amount of data, store it in isolated systems, and expect that insights will somehow emerge from these raw numbers.</p>
<p>Simply storing data isn’t enough. To truly unlock its value, organizations need to treat <strong>data as a product</strong>: something that is deliberately designed and made usable for specific business needs.</p>
<p>At first glance, this may sound like additional overhead. Another layer of governance? That’s going to slow our initiatives down! But in reality, the opposite is true. Consider the amount of time spent questioning data sets when they are dumped in a centralized system. How many times have you asked yourself questions like: Is it really the data set we need? Oh wait, this column actually means something else in our data model. The data in this table seems incomplete. I can’t understand what it represents. The schema changed again, now our system is broken in production…</p>
<p>Modern frameworks have proposed alternative ways of managing data sets. But it all stems from a different way of <strong>thinking</strong> about your data. In this post, we’ll break down the key concepts behind <strong>data as a product</strong>, the difference between <strong>data sets and data products</strong>, and how this shift improves <strong>usability, accessibility, and governance</strong>.</p>
<hr />
<h2 id="heading-understanding-the-basics-data-data-sets-and-products"><strong>Understanding the Basics: Data, Data Sets, and Products</strong></h2>
<p>Before diving into <strong>data as a product</strong>, let’s clarify some fundamental terms:</p>
<ul>
<li><p><strong>Data:</strong> The raw facts and figures collected by an organization—numbers, text, timestamps, sensor readings, etc.</p>
</li>
<li><p><strong>Data Set:</strong> A structured collection of related data points, typically stored in formats like <strong>tables, CSV files, or databases</strong>. Examples include a table of <strong>customer transactions, machine sensor logs, or financial records</strong>.</p>
</li>
<li><p><strong>Product:</strong> Something deliberately designed and built to provide value to a user, whether physical (a phone) or digital (a mobile app).</p>
</li>
<li><p><strong>Data Product:</strong> A combination of <strong>data set(s), domain model, and user experience,</strong> <strong>designed for a specific use case</strong>. Examples include a fraud detection model based on transaction data, a self-service analytics dashboard for sales teams, or an API that provides real-time customer insights.</p>
</li>
</ul>
<hr />
<h2 id="heading-data-set-vs-data-product-whats-the-difference"><strong>Data Set vs. Data Product: What’s the Difference?</strong></h2>
<p>Let’s start with an example and consider the sales data in an e-commerce business.</p>
<ul>
<li><p>A <strong>data set</strong> might contain <strong>raw customer transactions</strong> with thousands of line items. Without context, a user might struggle to extract insights. The transactions don’t serve any purpose by themselves: they are simply a collection of information.</p>
</li>
<li><p>A <strong>data product</strong> could be a <strong>"Monthly Sales Performance Dashboard"</strong>, which combines one or more data sets (the raw transactions), the domain model (aggregation into revenue trends, regional breakdowns, …) and the user experience (the graphic UI of the dashboard, its access policies, the ability to drill down in a report, …).</p>
</li>
</ul>
<p>A data set is a fundamental component of a data product. But a data product is much more than that. It must include:</p>
<p>✅ <strong>Defined ownership</strong> – Who is responsible for maintaining the data?</p>
<p>✅ <strong>Clear documentation</strong> – What does the data mean? How should it be used?</p>
<p>✅ <strong>Usability</strong> – Is the data structured in a way that users can access and understand?</p>
<p>✅ <strong>Ongoing updates &amp; maintenance</strong> – Is the data kept fresh and accurate?</p>
<h3 id="heading-business-domains-define-data-products"><strong>Business Domains Define Data Products</strong></h3>
<p>Organizations don’t need just data. They need <strong>data that serves a purpose</strong>. That’s why defining data products should always start from the <strong>business domain</strong>:</p>
<p>🔹 <strong>Finance teams</strong> may need a <strong>revenue forecasting data product</strong> based on historical transactions.</p>
<p>🔹 <strong>Marketing teams</strong> may need a <strong>customer segmentation data product</strong> that enriches demographic data with purchase behavior.</p>
<p>🔹 <strong>Operations teams</strong> may need a <strong>real-time logistics dashboard</strong> that tracks shipment statuses and delays.</p>
<p>By starting from business domain needs, companies can design data products that are immediately useful, rather than dumping raw data into a central repository and expecting teams to figure it out on their own.</p>
<hr />
<h2 id="heading-what-does-it-mean-to-treat-data-as-a-product"><strong>What Does It Mean to Treat Data as a Product?</strong></h2>
<p>A data product must be <strong>designed, maintained, improved, and eventually retired when no longer needed</strong>. This approach ensures that data remains valuable and does not become obsolete. The whole lifecycle should revolve around the following key principles that augment the world of “data sets” by introducing product thinking as a new dimension.</p>
<h3 id="heading-usability-amp-customer-centricity"><strong>Usability &amp; Customer-Centricity</strong></h3>
<p>🔹 <strong>Who needs this data? How will they use it?</strong></p>
<p>A well-designed data product is intuitive and built for the end user. This means:</p>
<p>✅ <strong>Well-documented definitions</strong> so users understand the data.</p>
<p>✅ <strong>Consistent structure and formatting</strong> to avoid confusion.</p>
<p>✅ <strong>Version-controlled updates</strong> to prevent disruptions in workflows.</p>
<p>Unlike traditional data management, which prioritizes storage and availability, a product mindset ensures that <strong>data is packaged for real-world usage</strong>, just like a consumer-facing app or tool.</p>
<h3 id="heading-accessibility"><strong>Accessibility</strong></h3>
<p>🔹 <strong>How easily can users find and interact with the data?</strong></p>
<p>Many organizations struggle with <strong>data locked away in silos</strong>, making it difficult for teams to access and use. A <strong>data product should be discoverable and well-integrated</strong>, for example it should have:</p>
<p>✅ <strong>Self-service data catalogs</strong> to remove unnecessary friction between teams and avoid questions such as “Could you tell me which data sets are available?”</p>
<p>✅ <strong>Role-based permissions</strong> to manage security and compliance.</p>
<p>✅ <strong>Standardized formats</strong> to allow smooth integration with other tools and platforms.</p>
<p>By ensuring <strong>discoverability and controlled access</strong>, data becomes a <strong>trusted, reusable resource</strong> instead of a hidden asset that requires manual extraction.</p>
<hr />
<h2 id="heading-pragmatic-governance-with-data-as-a-product"><strong>Pragmatic Governance with Data as a Product</strong></h2>
<p>One of the biggest challenges with traditional data management is poor governance—unclear ownership, inconsistencies, and compliance risks. Treating data as a product <strong>solves many governance issues by design</strong>.</p>
<h3 id="heading-1-clear-ownership-amp-accountability"><strong>1. Clear Ownership &amp; Accountability</strong></h3>
<ul>
<li><p>Every data product has a <strong>designated owner</strong> (often a business or data team) responsible for its accuracy and updates.</p>
</li>
<li><p>Unlike traditional IT-driven data models, ownership is <strong>distributed across business domains</strong>. This ensures, for example, that data issues are addressed by people who have a deep understanding of the business processes that generate the data itself.</p>
</li>
</ul>
<h3 id="heading-2-built-in-data-quality"><strong>2. Built-in Data Quality</strong></h3>
<ul>
<li><p>Errors and inconsistencies are addressed <strong>before reaching users</strong>, thanks to holistic observability.</p>
</li>
<li><p><strong>Data contracts</strong> ensure that data sources follow a standard schema and remain compatible across different systems (see the sketch after this list).</p>
</li>
</ul>
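<p>To make the data-contract idea concrete, here is a minimal sketch in plain Python (the field names and rules are hypothetical; in practice you would typically use a schema registry or a validation library):</p>
<pre><code class="language-python">from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class TransactionContract:
    """Schema that producers of a (hypothetical) transactions data set agree to honor."""
    transaction_id: str
    amount_eur: float
    transaction_date: date

def validate_record(record: dict) -&gt; TransactionContract:
    # Fail fast on the producer side, before bad data reaches any consumer.
    return TransactionContract(
        transaction_id=str(record["transaction_id"]),
        amount_eur=float(record["amount_eur"]),
        transaction_date=date.fromisoformat(record["transaction_date"]),
    )

validate_record({
    "transaction_id": "t-1",
    "amount_eur": "19.99",
    "transaction_date": "2025-01-31",
})  # raises immediately if a field is missing or malformed
</code></pre>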
<h3 id="heading-3-stronger-security"><strong>3. Stronger Security</strong></h3>
<ul>
<li><p><strong>Access controls</strong> are built into data products, preventing unauthorized access while facilitating access for authorized users.</p>
</li>
<li><p>Automated <strong>audit logs</strong> track data usage, making compliance with GDPR, CCPA, and other regulations easier.</p>
</li>
</ul>
<p>📌 <strong>According to a 2024 Gartner survey (Evolution of Data Management), the top investment trends for the next three years are AI-ready Data Initiatives and Data quality &amp; Governance</strong>. Treating data as a product supports both: AI models require structured, well-maintained data, and governance naturally improves when ownership and usability are prioritized.</p>
<hr />
<h2 id="heading-data-as-a-product-is-a-transformation-not-just-a-definition"><strong>Data as a Product is a Transformation, Not Just a Definition</strong></h2>
<p>Shifting to data as a product is not just a matter of changing terminology—it requires a fundamental transformation in how an organization <strong>creates, manages, and uses data</strong>. This shift impacts business processes, technology, governance, and culture.</p>
<p>The transition can be challenging, especially in organizations with legacy systems and siloed data ownership. More importantly, the biggest hurdle is a <strong>culture</strong> that treats data as an IT responsibility rather than a business enabler.</p>
<p>If you're looking to <strong>adopt a data-as-a-product approach</strong> and need guidance on where to start, <strong>get in touch</strong>. We can help you design a strategy that fits your business and unlocks the full value of your data. 🚀</p>
<p>Here’s a <a target="_blank" href="https://blog.datachef.co/data-as-a-product-data-mesh-team-topologies">story on the transformative journey of Data as a Product</a> at one of our past clients.</p>
]]></content:encoded></item><item><title><![CDATA[Improving Information Retrieval with Knowledge Graphs: Comparing VectorDB RAG vs Graph-Powered RAG on AWS]]></title><description><![CDATA[Introduction
Large Language Models (LLMs) have become ubiquitous, but their tendency to “hallucinate” non-factual details remains a challenge. Retrieval Augmented Generation (RAG) was introduced to ground LLM responses in actual data by retrieving re...]]></description><link>https://blog.datachef.co/improving-information-retrieval-with-knowledge-graphs-comparing-vectordb-rag-vs-graph-powered-rag-on-aws</link><guid isPermaLink="true">https://blog.datachef.co/improving-information-retrieval-with-knowledge-graphs-comparing-vectordb-rag-vs-graph-powered-rag-on-aws</guid><category><![CDATA[RAG ]]></category><category><![CDATA[knowledge graph]]></category><category><![CDATA[llm]]></category><dc:creator><![CDATA[Ali Yazdizadeh]]></dc:creator><pubDate>Mon, 10 Feb 2025 15:48:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739217965006/9ea94265-4852-442d-b55d-525e43964513.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p>Large Language Models (LLMs) have become ubiquitous, but their tendency to “hallucinate” non-factual details remains a challenge. Retrieval Augmented Generation (RAG) was introduced to ground LLM responses in actual data by retrieving relevant facts and incorporating them into prompts. However, when your data is highly interconnected—as in codebases, where functions call one another, or scientific documents, which cite one another—traditional vector-based search can miss key relationships.</p>
<p>In this post, we’ll explore how integrating Knowledge Graphs into the RAG pipeline can boost the relevance and factual accuracy of the output. We’ll compare a conventional VectorDB RAG (using <a target="_blank" href="https://aws.amazon.com/bedrock/">Amazon Bedrock</a> with <a target="_blank" href="https://aws.amazon.com/opensearch-service/">OpenSearch</a>) with a Graph-Powered RAG (using Amazon Bedrock alongside a graph database like <a target="_blank" href="https://aws.amazon.com/neptune/">Amazon Neptune</a> or <a target="_blank" href="https://neo4j.com/">Neo4j</a>).</p>
<p><strong>Summary YouTube Video:</strong></p>
<ul>
<li><strong>Will be Published Soon!</strong></li>
</ul>
<h2 id="heading-problem-statement"><strong>Problem Statement</strong></h2>
<p>While RAG helps mitigate LLM hallucinations by providing factual context, its typical implementation using vector similarity search has a critical flaw: it looks for similar chunks of documents to add to the context, and <strong>“similar” does not always mean “relevant”!</strong> For instance, when asking, “What are the possible causes of obesity?” a vector search might return descriptive text about obesity rather than pinpointing factors like “hypothyroidism” or “bad diet”—the actual causes.</p>
<p>Various techniques have been tried to solve this issue, most notably re-ranking the RAG results with another LLM based on relevancy. But one solution we are interested in is using Knowledge Graphs. We wrote a series of blog posts on Knowledge Graphs: <a target="_blank" href="https://blog.datachef.co/knowledge-graphs-1">Our Blog Series</a>. Here we want to see if adding them to the LLM pipeline can improve the quality of the generation.</p>
<p><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXeCySUxpma0x7dspuy9harOZ9mi2KY9t2eIiF0BcY6XcyPomBmjcyO3gDBJrKVQmsyV0QYVSVFeAbntZ7RzAa---fAwDlitJZXLjifLigJtrDsG2RY0hD30xi_Rt3JZJLCE0izXiQ?key=WN8P_mxwsUWTSOTkoKG1EeBW" alt /></p>
<p><strong><em>Fig 1. Knowledge Graph-Powered RAG vs Vector RAG</em></strong></p>
<h2 id="heading-use-case-example-code-refactoring"><strong>Use Case Example: Code Refactoring</strong></h2>
<p>Consider a codebase, here the <a target="_blank" href="https://github.com/psf/requests">Python Requests Library</a>, where functions are scattered across multiple files and are highly interconnected. Suppose you need to refactor a function (say, <code>utils.urldefragauth</code>), but you’re unsure which other functions depend on it. A simple vector search might return the function definition only. However, a Knowledge Graph that maps function dependencies can retrieve all related functions—giving you the necessary context to ensure a safe refactor.</p>
<h2 id="heading-knowledge-graph-powered-rag-how-does-it-work"><strong>Knowledge Graph-Powered RAG: How Does It Work?</strong></h2>
<p>Traditional RAG uses vector embeddings to find similar text chunks. In contrast, <strong>Knowledge</strong> <strong>Graph-Powered RAG</strong> leverages a Knowledge Graph to capture explicit relationships among data points (e.g., function calls in code). This approach uses a query language—such as Cypher—to ask precise questions about how entities are connected.</p>
<p>For example, a valid Cypher query to find which functions call a particular function might be:</p>
<pre><code class="lang-sql">MATCH (caller:Function)-[:CALLS]-&gt;(target:Function {name: 'utils.urldefragauth'})

RETURN caller.name AS CallingFunction, caller.code AS CallingFunctionCode

LIMIT 5;
</code></pre>
<p>Here I used the <a target="_blank" href="https://www.llamaindex.ai/">LlamaIndex</a> package, which provides, among other things, the <a target="_blank" href="https://docs.llamaindex.ai/en/stable/module_guides/indexing/lpg_index_guide/#texttocypherretriever">TextToCypherRetriever</a>, which does the heavy lifting of turning a text query into a Cypher query compatible with your knowledge graph. What happens underneath, though, is basically begging an LLM to turn the query and the knowledge graph schema into a valid Cypher query!</p>
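<p>If you want to run the generated Cypher against the graph yourself, a minimal sketch with the official <code>neo4j</code> Python driver could look like this (the connection details are placeholders for your own instance):</p>
<pre><code class="lang-python">from neo4j import GraphDatabase

# Placeholder URI and credentials: point these at your Neo4j/AuraDB instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

query = """
MATCH (caller:Function)-[:CALLS]-&gt;(target:Function {name: $name})
RETURN caller.name AS CallingFunction, caller.code AS CallingFunctionCode
LIMIT 5
"""

with driver.session() as session:
    for record in session.run(query, name="utils.urldefragauth"):
        print(record["CallingFunction"])

driver.close()
</code></pre>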
<p><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXc2EOn5rBk923b6Nq0UtJyt06J8yJ3kDVkrK6eeARXySE_waoULUmnIDrKOvgUiLcVjQpR7r_WLXpn_5Tz0q4aqvfHj_g08BygBJfMrv4N1z7wtg74B7l7T340Pl5AZ1o45MVmd?key=WN8P_mxwsUWTSOTkoKG1EeBW" alt /></p>
<h2 id="heading-how-to-prepare-the-code-graph"><strong>How to Prepare the Code Graph?</strong></h2>
<p>Code repositories are a perfect example of highly interconnected data. Modules import functions from various other modules, and those functions are in turn called from many other places.</p>
<p>A crucial step in our workflow is preparing a fact-based Knowledge Graph—in this case, a graph of functions and their relationships, specifically which function calls which. Here, I used Python’s <code>ast</code> library to parse the Python files of the popular <code>requests</code> package.</p>
<p>Here are some code snippets:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_python_file</span>(<span class="hljs-params">file_path</span>):</span>

    <span class="hljs-string">"""Parse a Python file to extract functions, classes, dependencies, imports, and code."""</span>

    <span class="hljs-keyword">with</span> open(file_path, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> f:

        source = f.read()

        source_lines = source.split(<span class="hljs-string">'\n'</span>)

        tree = ast.parse(source, filename=file_path)


    visitor = FunctionVisitor(source_lines)

    visitor.visit(tree)

    <span class="hljs-keyword">return</span> visitor.functions, visitor.imports, visitor.classes
</code></pre>
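<p>The post doesn’t show <code>FunctionVisitor</code> itself; a minimal sketch of what such a visitor might look like follows (the attribute names mirror the snippet above, but the details are assumptions):</p>
<pre><code class="lang-python">import ast

class FunctionVisitor(ast.NodeVisitor):
    """Collect function definitions, their source code, and the names they call."""

    def __init__(self, source_lines):
        self.source_lines = source_lines
        self.functions = {}  # name -&gt; {"code": ..., "calls": [...]}
        self.imports = []
        self.classes = []

    def visit_Import(self, node):
        self.imports.extend(alias.name for alias in node.names)
        self.generic_visit(node)

    def visit_ClassDef(self, node):
        self.classes.append(node.name)
        self.generic_visit(node)

    def visit_FunctionDef(self, node):
        # Slice the original source so each node keeps its exact code.
        code = "\n".join(self.source_lines[node.lineno - 1:node.end_lineno])
        calls = []
        for n in ast.walk(node):
            if isinstance(n, ast.Call):
                if isinstance(n.func, ast.Name):
                    calls.append(n.func.id)       # e.g. urldefragauth(...)
                elif isinstance(n.func, ast.Attribute):
                    calls.append(n.func.attr)     # e.g. utils.urldefragauth(...)
        self.functions[node.name] = {"code": code, "calls": calls}
        self.generic_visit(node)
</code></pre>
<p>Each caller/callee pair extracted this way can then be written to the graph as a <code>CALLS</code> relationship between <code>Function</code> nodes.</p>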
<p>Below is a part of the final graph (stored in Neo4j AuraDB):</p>
<p><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXei7x8Mg-qxnfmyZxCWeFpAkvvVYF9S2qhQ2S3o_JCd7dr6XJbh_RiSj3vEImcgne74XgfBcnZIYI2YT6VEVGjcRzj_KRIuDEUUcG3VBOOW-zmVmN-TCzbSTMsbwNh9KUMt5sYHxg?key=WN8P_mxwsUWTSOTkoKG1EeBW" alt /></p>
<h2 id="heading-graph-powered-rag-vs-vector-database-rag-example"><strong>Graph-Powered RAG vs Vector Database RAG: Example</strong></h2>
<p>To show the difference between a vector database approach and a knowledge graph approach, I set up an AWS Bedrock Knowledge Base, which uses OpenSearch Service for vector search, and compared its results with the Knowledge Graph hosted on a Neo4j database (alternatively, I could have used Amazon Neptune for the graph with some small changes in the code!).</p>
<p>I asked both of these workflows (or you can call them AI agents!) what happens if I change a function. This question clearly shows the difference the graph approach makes: to answer it, you need to know which other functions call the function in question, and those callers can live in any other file!</p>
<p>Here is the input to both of these agents:</p>
<blockquote>
<p>I want to refactor the utils.urldefragauth function. What other functions call it and might get affected? Show the CODE to those function.</p>
</blockquote>
<h2 id="heading-graph-powered-rag-answer-generate-using-amazon-bedrock"><strong>Graph-Powered RAG Answer (Generate using Amazon Bedrock):</strong></h2>
<p>Based on the provided information, the function that calls <code>utils.urldefragauth</code> is <code>adapters.HTTPAdapter.request_url</code>. This function might be affected if you refactor <code>utils.urldefragauth</code>. Here's the code for <code>adapters.HTTPAdapter.request_url</code>:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">request_url</span>(<span class="hljs-params">self, request, proxies</span>):</span>

    <span class="hljs-string">"""Obtain the url to use when making the final request.
    If the message is being sent through a HTTP proxy, the full URL has to
    be used. Otherwise, we should only use the path portion of the URL.
    This should not be called from user code, and is only exposed for use
    when subclassing the
    :class:`HTTPAdapter &lt;requests.adapters.HTTPAdapter&gt;`.
    :param request: The :class:`PreparedRequest &lt;PreparedRequest&gt;` being sent.
    :param proxies: A dictionary of schemes or schemes and hosts to proxy URLs.
    :rtype: str
    """</span>
    proxy = select_proxy(request.url, proxies)
    scheme = urlparse(request.url).scheme
    is_proxied_http_request = proxy <span class="hljs-keyword">and</span> scheme != <span class="hljs-string">"https"</span>
    using_socks_proxy = <span class="hljs-literal">False</span>

    <span class="hljs-keyword">if</span> proxy:
        proxy_scheme = urlparse(proxy).scheme.lower()
        using_socks_proxy = proxy_scheme.startswith(<span class="hljs-string">"socks"</span>)

    url = request.path_url
    <span class="hljs-keyword">if</span> url.startswith(<span class="hljs-string">"//"</span>):  <span class="hljs-comment"># Don't confuse urllib3</span>
        url = <span class="hljs-string">f"/<span class="hljs-subst">{url.lstrip(<span class="hljs-string">'/'</span>)}</span>"</span>

    <span class="hljs-keyword">if</span> is_proxied_http_request <span class="hljs-keyword">and</span> <span class="hljs-keyword">not</span> using_socks_proxy:
        url = urldefragauth(request.url)

    <span class="hljs-keyword">return</span> url
</code></pre>
<p>This function uses <code>urldefragauth</code> when dealing with proxied HTTP requests that are not using a SOCKS proxy. Any changes to the behavior or signature of <code>utils.urldefragauth</code> could potentially impact this function, so you'll need to ensure that your refactoring doesn't break the existing functionality or interface expected by <code>adapters.HTTPAdapter.request_url</code>.</p>
<h2 id="heading-vector-database-rag-answer-generate-using-amazon-bedrock"><strong>Vector Database RAG Answer (Generate using Amazon Bedrock):</strong></h2>
<p>I apologize, but I cannot provide an answer to your question based on the given context information. The provided code snippet does not contain any reference to a <code>utils.urldefragauth</code> function or any functions that call it. The code appears to be a collection of utility functions related to networking, proxies, and HTTP requests, but it doesn't include the specific function you're asking about or any functions that call it. Without more relevant context about the <code>utils.urldefragauth</code> function, I'm unable to identify the functions that might be affected by refactoring it or show you their code.</p>
<h2 id="heading-comparison">Comparison:</h2>
<p>As you can see, because the vector database cannot see the function-call relationships, it was unable to answer the question, while the Knowledge Graph approach successfully found the relationship and answered it!</p>
<h2 id="heading-table-summary"><strong>Table Summary:</strong></h2>
<div class="hn-table">
<table>
<thead>
<tr>
<th><strong>Approach</strong></th><th><strong>Pros</strong></th><th><strong>Cons</strong></th></tr>
</thead>
<tbody>
<tr>
<td><strong>VectorDB RAG</strong></td><td>Simple setup; fast semantic search</td><td>May miss inter-entity relationships; less context-aware</td></tr>
<tr>
<td><strong>Graph-Powered RAG</strong></td><td>Captures complex relationships; provides contextual insights</td><td>Requires additional overhead; more complex to maintain; higher cost</td></tr>
</tbody>
</table>
</div><h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>For applications where factual accuracy and interconnected data are critical, a Graph-Powered RAG approach shows considerable promise. Although it demands a higher upfront investment in terms of graph preparation and system complexity, the improved contextual accuracy can be invaluable—especially for domains like code refactoring or multi-entity document retrieval.</p>
<p>For those interested in experimenting further, the full code repository is available here:<br /><a target="_blank" href="https://github.com/DataChefHQ/BlogProjects/tree/main/CodeGraphRAG">GitHub Repository: CodeGraphRAG</a></p>
]]></content:encoded></item></channel></rss>