Are You Data Economy Ready? From Thinking to Doing: Building Data Products
Talk of data products has taken the data world by storm. And I love it. It’s the use of data that delivers value. Data as a product puts that into practice.
We often hear the term “data products” as one of the four pillars of a data mesh. The data mesh paradigm embraces the principle of data as a product. Thinking of data as a product focuses on data’s use, and the value that it can deliver to the business, and guides the output of the data domain owner. Data owners within the organization are accountable to their eventual consumers—the business users champing at the bit for data insights—and must create products that fit their requirements. Thank you, data mesh evangelists, for increasing awareness, but let’s not limit adoption of the principle. Data product thinking is for everyone.
Yet, the notion remains a mystery for many or even a point of contention for some. What constitutes a data product?
Discovering what the product should be
A data product is a reusable building block built to deliver data, or insights from that data, for a specific purpose—in other words, use. A data product is designed for use, as opposed to an asset which might only be stored and protected. The process of data product thinking, therefore, should start by working backward from the customer or the specific use case, from the outside in. Borrowing from standard product development frameworks, the process starts with “discovery.” Requirements are ultimately delivered to product teams who build and deliver the product.
Some basic questions include the following:
- Who are the customers?
- What is the use case?
- What is the customer trying to achieve?
- What are their requirements?
- Is there a minimum viable product?
- What data is needed for that product?
There are a number of tools and frameworks associated with product discovery: customer interviews, customer journey mapping, demand tests, A/B testing, and more. But, ultimately, it’s about understanding customer needs and building a product that is fit for purpose.
In practice, however, it’s often the data teams who first begin the process as they have more data knowledge than their business partners. A data team might begin with an idea of how their data could benefit the business. They then socialize the idea and a prototype with their counterparts. From that point, the process could follow a classic discovery methodology to refine the product requirements.
Types of data products
In a previous blog post, I discussed different forms a data product can take using the sand, glass, or lamp metaphor, depending on the consumer of the data and the use case. Some data products will be components in a more complex analysis or context-specific business application. Those intermediate components must still be treated as a product because the final data product output is dependent on them. Quality sand makes good glass, or with the right minerals, it makes crystal. The process of working backward uncovers all data products along the value chain and their requirements.
Data must be thought of as a product along its entire value chain from source to consumer.
- Source-aligned data products. The data in these products comes from an operational system, like a CRM or an ERP application, is streamed from an IoT device, or is captured from web logs. These are the building blocks. As Sherlock Holmes says, “Data, data, data! I can’t make bricks without clay.” And these source-aligned products are the clay. Ownership of these data products lies within the domain that produces the data, the source domain. The product owners there are still responsible for ensuring that their data products meet the needs of downstream consumers. The figure below illustrates how these data products might be used, either as a single source product or combined as a component in a more complex data product. Regardless, these owners must think of their data as a product.
- Consumer-aligned data products. These products can take different forms, transforming source-aligned data products to meet a particular consumer or business need. Product teams within the consumer domain build these products based on their expertise and understanding of the business context. A developer or data scientist may want the data itself, transforming it to train a model, as in the diagram below. A business analyst might need only transactions from a specific region or for a specific type of sale, a subset of the source. Note the smaller block below. In other cases, a data application would deliver insights from multiple data sources directly into a business context. For example, a sales manager might want a lead scoring application, a product manager might want a price optimization tool, or the “consumer” might be non-human, with the application performing automated anomaly detection on a connected device and triggering a notification. These data applications combine multiple building blocks.
Some also talk about aggregate data products. As the name suggests, these products aggregate multiple data products, often as an intermediate step between multiple sources and the consumer. For example, a data enrichment service pulls attributes from several sources. Ultimately, many data products may be combined to create an end-product that addresses broad functional or enterprise-level objectives, like a customer 360, ESG reporting, or any other type of executive dashboard, pulling data from multiple sources or business units. These data products span domains or business units and require significant coordination.
Each type of data product must be built for consumption downstream and take into account the needs of the consumers. Gathering those requirements is the role of the product team, and the product manager in particular—whether it’s a source-aligned or consumer-aligned data product.
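One way to picture working backward from an end product is as a walk through a dependency graph. The sketch below is only an illustration under assumed inputs: the catalog, the product names, and the `upstream_products` helper are all hypothetical and not part of any specific tool.

```python
from collections import deque

# Hypothetical catalog: each data product lists the products it consumes.
# All names and type labels here are illustrative assumptions.
CATALOG = {
    "crm_accounts": {"type": "source-aligned", "inputs": []},
    "web_logs": {"type": "source-aligned", "inputs": []},
    "erp_orders": {"type": "source-aligned", "inputs": []},
    "enriched_customer": {
        "type": "aggregate",
        "inputs": ["crm_accounts", "web_logs"],
    },
    "lead_scoring_app": {
        "type": "consumer-aligned",
        "inputs": ["enriched_customer", "erp_orders"],
    },
}


def upstream_products(catalog, end_product):
    """Walk backward from an end product and return every product it
    depends on, in breadth-first order and without duplicates."""
    seen = []
    queue = deque(catalog[end_product]["inputs"])
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.append(name)
        queue.extend(catalog[name]["inputs"])
    return seen


chain = upstream_products(CATALOG, "lead_scoring_app")
for name in chain:
    print(f"{name}: {CATALOG[name]['type']}")
```

Every product the walk returns has a downstream consumer, so each one needs an owner who gathers requirements for it.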
When data products have overlapping requirements, the product team deconstructs the end-product to identify common components. These components are like LEGO pieces that can be aggregated into multiple end-products. For example, a customer 360 and a product 360 might both include data on product usage. The product team would be tasked with reconciling requirements as much as possible to ensure reuse of the product usage data product.
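As a rough sketch of that deconstruction step, a team could list the components each end-product requires and look for overlaps. The end-product names, component names, and the `shared_components` helper below are hypothetical and chosen only to mirror the customer 360 and product 360 example.

```python
from collections import defaultdict

# Hypothetical end-products and the component data products each requires.
END_PRODUCTS = {
    "customer_360": {"customer_profile", "product_usage", "support_tickets"},
    "product_360": {"product_catalog", "product_usage"},
}


def shared_components(end_products):
    """Map each component used by more than one end-product to the
    end-products that consume it."""
    consumers = defaultdict(list)
    # Sort for deterministic ordering of the consumer lists.
    for end_product, components in sorted(end_products.items()):
        for component in sorted(components):
            consumers[component].append(end_product)
    return {c: users for c, users in consumers.items() if len(users) > 1}


reusable = shared_components(END_PRODUCTS)
print(reusable)
```

Components that show up here are the candidates for reconciling requirements so that one data product can serve several end-products.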
An enterprise-wide Data Product Council or Steering Committee coordinates requirements across data products.
Introducing the product team
Most data leaders advocate having a cross-functional data product team, incorporating the roles needed to bring a product from conception to production, rather than a series of function-specific teams that build a data product in a serial process. There is a wide range of potential roles. Some have more of a business focus, like the business analysts or data scientists, and others are more on the IT side, but the two must come together. David Dadoun, CDO at Bombardier Recreational Products (BRP), explained in a recent session at the Data Innovation Summit in Stockholm that these data and analytics teams, or “DnA teams,” are a double helix. The business and IT strands fit perfectly together, with the data product roles bridging the two.
The following table provides descriptions of possible roles in a data product team:
Not all teams include all roles and not all data teams are alike. At the software company Flexport, the domain data products are developed as part of the software development process. The software product manager takes responsibility for the data products associated with the software products. Responsibilities of the software engineering manager and engineers expand likewise. An analytics engineer is added to the software product team to design and govern the data products and work closely with the relevant data stewards.
In addition to the formal job description and required skills, data product teams need certain characteristics. Omar Khawaja, Head of Business Intelligence at Roche, described in a recent blog post how data product teams are not feature teams taking orders or waiting for new requirements; they are more proactive. They must be empowered but also self-motivated to guide product development from conception to deployment, balancing complex contexts and end users. For Omar, data products are the heart of the data mesh. The other three principles of the data mesh are represented by the soul (domain ownership), the body (self-service infrastructure), and the mind (federated governance). But it’s not just for a data mesh; data product teams are the heart of any data-driven organization.
A few of the things data product teams must be able to do, highlighted by Omar, include the following:
- Handle diverse source systems, learning new technologies or techniques when necessary
- Understand the business context and the end user to ensure the appropriate business logic and a positive user experience
- Drive continuous discovery for building and testing new products
- Anticipate needs that business users might not yet be able to articulate
- Adhere rigorously to compliance policies, finding solutions within them
Embedding data product thinking
It is essential for the organization as a whole to embrace the notion of data products. Consumers should learn to articulate their needs; data producers must deliver product-ready data. Data product teams within the source domains need the technology and data skills required to build data products, and they must either understand the context in which the product will be used or work closely with experts who do.
Developing this thinking requires organizational and behavioral changes—a focus on the people and processes within the organization. New goals must be established to drive the desired behavior, with new measures of success more focused on the data customer, the use of the data, and the value delivered to the organization.
My previous blog post, Address Organizational Issues When Weaving the Data Mesh, elaborates on required investments:
- Enablement—How do I do it?
- Incentives—Why should I do it?
- Enforcement—What happens if I don’t?
Let’s hope to do well enough with the first two so the last isn’t necessary.
Leverage usage metrics to ensure valuable data products
The bottom line is a simple equation: data plus use equals value. By tracking the use of data products across the entire value chain, we can be sure to build the products data consumers need. Through usage metrics, we can start to drive behaviors that are important to an organization. Increasing use becomes a motivator for the product teams within the domains.
Usage metrics are also important feedback for product teams. They’ll know which data products to focus on, which to refine, and which to retire. From an end user perspective, usage metrics indicate the quality of a data product and perhaps drive more use and more value into the organization. Ultimately, usage reports will establish a continuous process of building, delivering, and evaluating data products to ensure value is created.
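To make the feedback loop concrete, here is a minimal sketch of turning raw access records into a focus, refine, or retire signal. The access log, the product names, and the threshold of three distinct consumers are all assumptions made for illustration; a real team would choose its own measures and cutoffs.

```python
# Hypothetical access log for one reporting period:
# (data_product, consumer) pairs. All names are illustrative.
ACCESS_LOG = [
    ("product_usage", "analyst_a"),
    ("product_usage", "analyst_b"),
    ("product_usage", "ml_pipeline"),
    ("product_usage", "analyst_a"),
    ("regional_sales", "analyst_a"),
    ("regional_sales", "analyst_a"),
]
ALL_PRODUCTS = ["product_usage", "regional_sales", "legacy_export"]


def classify(products, access_log, focus_min_consumers=3):
    """Label each product by its number of distinct consumers.
    The threshold is an assumed example value, not a recommendation."""
    consumers = {product: set() for product in products}
    for product, consumer in access_log:
        if product in consumers:
            consumers[product].add(consumer)

    labels = {}
    for product in products:
        count = len(consumers[product])
        if count == 0:
            labels[product] = "retire candidate"
        elif count < focus_min_consumers:
            labels[product] = "refine"
        else:
            labels[product] = "focus"
    return labels


labels = classify(ALL_PRODUCTS, ACCESS_LOG)
for product, label in labels.items():
    print(f"{product}: {label}")
```

Counting distinct consumers rather than raw accesses is one design choice among many; it keeps a single heavy user from making a product look widely adopted.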
For more on the data product journey—from identifying audiences, use cases, and the form of the data product to pricing options and choosing which channels to market—please join the session on Data Commercialization: Your Guide to Taking Data to Market at the Snowflake Summit 2023. You’ll also find plenty of Summit sessions on data apps. Check out Build an App for That: The Next Big Opportunity for Data Entrepreneurs with Mode CTO Benn Stancil, or get a little more technical in the Build Your Snowpark-Powered Data Products and Data Applications with DataOps.live session.