How to Calculate Story Points – Unravelling the Estimation Controversies

Story points are still one of the great mysteries of the Agile world.

They are such a mystery that they gave birth to the #NoEstimates movement. Part of the Agile community has indeed found better ways to answer when the job will be done, but most of us still struggle to give correct story point estimations. We struggle because we don’t know exactly how much one story point is or what it means for delivering the product increment.

I have a formula for you, so keep reading. 

If you google how to calculate story points, you’ll find dozens of articles about how to count story points, how to map them to hours (and why you shouldn’t do it!), and how to track the burning of the stories, or was it hours… There are also plenty of attempts to explain that the story point is a relative measure that represents size, effort and complexity. But how do you combine them into one story point? It depends on your context, and here’s how.

Misunderstanding story points

Most of the beliefs about story points are myths and delusions. I haven’t met a developer who thinks they’ll do their work better if they estimate it with story points beforehand. It’s a typical manager’s mistake to believe that introducing story point estimation will magically build predictability into their release process. It’s a huge misunderstanding, just like the following ones.

Story points are unbiased and overcome the subjectiveness of estimating in hours

Giving estimations is subjective regardless of the unit. It’s complicated and error-prone. I have always felt uncomfortable giving estimations, no matter how well I know the work. Estimating new features and bugs is totally different, and you can’t break the tasks down to lines of code. Theoretically the smaller the item, the smaller the error, but again – you need to deliver a product increment, not something small enough. 

When I don’t have enough information, I always give a much larger number. When I feel that the rest of the team will give a small one, I tend to follow because of peer pressure – I don’t want to be the one who always gives large numbers, nor do I want to look incompetent. Some team members may have absolutely no idea what the story is about, and they will just shrug and agree with the rest of the team. Throw your numbers in private or anonymously; there are enough tools for that.

On another note, planning poker can turn into a consistently poor experience for the team if you don’t know what these numbers mean.

Here’s why:

  • the team members feel insecure giving estimates for things they don’t completely understand; 
  • teams are supposed to discuss and clarify, but that doesn’t always happen, and some of us might be left feeling that we don’t contribute to the decision;
  • it leaves the taste of wasted time – we could have put any number there – it doesn’t matter.

The greater purpose of story point estimations is to make sure the whole team can share and discuss their considerations.

I’ll tell you what happens to me. I’ve been on the testing side for so many years now, that I don’t see features anymore. I see bugs. In most of the standard software functionalities – I know what bugs we’ll find. And usually, I add a couple of story points for the expected bugs.

Using relative story points deals with uncertainty

Several numbering patterns are used for story points – the Fibonacci sequence, sequential counts or t-shirt sizes. Using these patterns makes us believe in relativity, and that the estimates are authentic because the scale deals with the incredible amount of uncertainty. The truth is that we have no idea how much work it really is. And the worst part is that by hiding behind these relative story points we miss the opportunity to unravel as much as we can, decrease the uncertainty and improve team awareness.

We are supposed to identify a story that represents one story point and plan around it. That’s not convenient – the reference story will be different in every sprint, and we can’t always keep in mind which user story it was. Most importantly, the team could be working on different products, which cannot share a common reference story. Hence, the unit of one story point is actually different in every sprint. How can we trust the velocities then?!

Relative story points bring no valuable information. That’s why they don’t have to be relative. We can calculate them instead by revealing the layers of work and risks. Discussions and collaboration deal with uncertainty better and bring meaning to the random numbers.

Story points provide predictability

The strongest argument for using story points is that they provide predictability and protect the team from being overburdened.

Well, without knowing how much time the work will really take and what the process of delivering this potentially shippable product increment to the client looks like, we can’t talk about predictability at all. Moreover, we can’t speak about predictability when the team changes.

The only way to become more predictable is by using objective historical data.
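As a rough illustration of what leaning on historical data could look like, here is a minimal Python sketch that turns past sprint results into a forecast range. It is one possible reading, not a method this article prescribes: the throughput numbers, the backlog size and the `sprints_needed` helper are all assumptions invented for the example.

```python
import math
from statistics import mean

# Stories actually finished in each past sprint (made-up numbers).
completed_per_sprint = [6, 4, 7, 5, 3, 6]

# Stories still waiting in the backlog (made-up number).
remaining_stories = 30


def sprints_needed(remaining, throughput):
    """Sprints required to finish `remaining` stories at a given throughput, rounded up."""
    return math.ceil(remaining / throughput)


# A range is more honest than a single date: the best, typical and worst
# sprints seen so far bound the forecast.
forecast = {
    "best case": sprints_needed(remaining_stories, max(completed_per_sprint)),
    "typical": sprints_needed(remaining_stories, mean(completed_per_sprint)),
    "worst case": sprints_needed(remaining_stories, min(completed_per_sprint)),
}

for label, sprints in forecast.items():
    print(f"{label}: {sprints} sprints")
```

The forecast is only as good as the history behind it, so it loses its value when the team or the process changes.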

And last but not least, even the most accurate estimations cannot save you from unpredictable events, last-minute requirements and insensitive management interference. 

Story points don’t depend on the skills of the team members

A cross-functional team might be a dream for a team of superhumans who can draw designs, code on all levels and in all languages, test manually and automatically on all levels and… DevSecOps. But in real life, these are different team members and, not infrequently, different teams. So yes, it greatly matters who will do the tasks. The more I know the team, the more I adjust my estimates when it’s clear who will do the coding.

Story points are estimates for user stories. The user stories represent a piece of user value – which is the potentially shippable product increment.  Aiming to satisfy the Definition of Done, we need to consider all the different tasks (skillsets) but estimate the whole story.

Story points are a tool for collaboration and risk mitigation rather than estimation

If you want the whole team to improve their competency about the work needed to complete any story and start giving meaningful numbers that will decrease the risk of delays – pair them. Pair design, programming & testing, deployment. Encourage team members to do the tasks together. 

The designer will grasp why developers complain about designs, why they are challenging to implement, and why there are bugs.

The front-end engineer will be able to follow the designer’s train of thought and better understand the user needs. The next time, they will be able to take UX work into consideration when giving the estimate.

The tester and the developer will correct an incredible number of defects together very early in the process, making this one of the most cost-efficient development practices. It will save the time needed to build, deploy to the test environment, plan for testing, execute the tests and report, plan for fixing, fix, deploy to the test environment again, plan for retesting, retest… You can do the math yourself on how much a pairing session will save.

When we talk about increasing team awareness, an unpopular practice is to invite the stakeholders to your planning sessions. I can’t imagine this because stakeholder management is the blurriest area, and it requires impossible levels of transparency and trust. But work towards building such relationships and be sure that if your stakeholders know how much it costs to deliver what they want, the whole team will live a happier life.

Let the team sit together and deliver a shippable product increment.

Working together is fun. It’s the best team-building game, it brings great satisfaction and feeling of accomplishment, and that feeling is much stronger than if you do it alone because it’s shared.

Business acceptance criteria, testing criteria and UX criteria

Agile teams often underestimate acceptance criteria, although they can make the whole development process tremendously more effective and efficient.

Before the planning session:

  1. Make sure that every story is ready for planning.
  2. Use the backlog refinement sessions, for example, to add the business, testing, UX or other acceptance criteria as a compact and unambiguous checklist to the user story.
  3. Announce the story as ready for planning (can be a status on your board) and let the team get familiar with these criteria before the planning/estimation meeting.

The team will have the chance to think in private and come up with questions. And they will now be able to make a conscious and validated decision about the story’s estimate.

Something general but fundamental: whatever practice you adopt, record the initial state of your process (where you start) and define how you’ll measure whether the practice has improved it.

Let’s now make the best out of sprint planning and story point estimations.

How much is one story point?

It’s as much as you decide within your business, product, technical and team context. But it would be best if you accounted for a couple of factors. You’ll find and adjust the ones that best work for you over time, but here’s an example.

When estimating we tend to think positively – we think about building things, we don’t think about bad weather, and we always give the best-case scenario number, which is the smallest – that’s a 1. Well, it’s rarely a 1.

The components of a story point

There are many estimation strategies. The one described below is perhaps yet another one, but it’s pain-based, and it accounts for the unknown. The problem with any estimation is that there is no such thing as clairvoyance, and we can’t predict what will happen in reality. We can only try to think upfront about as many risk-increasing factors as possible (and refer to historical data).

Breaking down the story point into the following components lets the team use their expertise effectively and with higher confidence.

Complexity

Complexity is how much time we will have to think before writing the code or writing the tests. Complexity accounts for whether it’s an innovative, deeply algorithmic task or a relatively trivial one. Do we expect it to be difficult to draw, code or test because of twisted logic, too many exceptional cases to handle, or a great variety of data to manage? Has anyone done this before? Does the feature have non-deterministic behaviour?

We may decide that one story point of complexity is a task that we have done and tested many times before, has no more than 2-3 conditions to cover, doesn’t need any research or preparation, and doesn’t depend on the technology.

Volume

Sometimes, it’s not complicated, but it’s just a lot of work – many screens, many test cases, many data sets. It can be a low complexity task but a massive amount of time-consuming work. Do not underestimate it because its complexity seems low.  

Points of change

This component may come on top of high complexity or a large volume, or it can apply to a relatively straightforward task that requires changes in several architectural layers.

Points of change is my favourite because it’s the one that raises the most questions and can prevent the worst risks. It helps identify the impact of the changes, detects hard-to-manage dependencies and consequences, and is invaluable to the test plan.

Dependency

Managing dependencies is the most challenging task. Even the extreme indie/garage startups have external dependencies – tools, for example. While tooling seems manageable, input from people outside the team is impossible to predict. Well, these are all of your customers and you can’t avoid them. Do your best to start provisioning the external dependencies as early as possible. Have a plan B and C. Have clear priorities and make sure you rate that dependency because, if not accounted for, it will either delay your delivery or overburden your team, or both… or worse.

Competency

Competency and availability are related to complexity, points of change and dependency, but they concern who can do the job. No matter how genuinely cross-functional your team is (there’s no such thing in real life), some team members will always be more specialized in one area than others. Even if you leave the task unassigned, this still doesn’t mean that anyone can do it, and for sure it doesn’t mean that this (any)one will not have to take an unplanned day off.

Story point calculation in practice

Open your backlog, arranged by priority, and add the components of the story point as columns. Then add one column for the average. 

User story by user story, play planning poker anonymously for each of the components. If the numbers are close to each other, you can safely use the average for the whole story.
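The averaging step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed formula: the component names follow the sections above, while the vote values, the `max_spread` threshold of 2 and the function names are assumptions made up for the example. It also assumes that “the average” means averaging the votes within each component first and then averaging the components.

```python
from statistics import mean

# Anonymous votes per component for one user story (illustrative numbers).
votes = {
    "complexity": [3, 3, 5],
    "volume": [2, 3, 2],
    "points_of_change": [5, 5, 3],
    "dependency": [1, 2, 1],
    "competency": [2, 2, 3],
}


def votes_are_close(component_votes, max_spread=2):
    """Return True when the highest and lowest vote differ by at most max_spread."""
    return max(component_votes) - min(component_votes) <= max_spread


def story_estimate(votes_by_component):
    """Average the votes within each component, then average the components."""
    component_averages = {
        name: mean(component_votes)
        for name, component_votes in votes_by_component.items()
    }
    return component_averages, mean(list(component_averages.values()))


component_averages, estimate = story_estimate(votes)

# Components whose votes are too far apart deserve a conversation first.
needs_discussion = [name for name, v in votes.items() if not votes_are_close(v)]

print(component_averages)
print(f"Story estimate: {estimate:.1f}")
print(f"Components to discuss: {needs_discussion}")
```

A plain average treats every component as equally important. If your team finds that one of them, dependency for example, hurts more than the others, a weighted average is an easy adjustment.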

Encourage your team to discuss if there are outliers. The outliers for each component are the highest risks. If a team member has given a much higher number consciously and because of their knowledge and has sufficient arguments, it’s better to adjust the overall estimation up. If it’s because of insecurity – continue the clarification, or make sure they are provided with the chance to learn more about the feature. 
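Spotting the outliers can also be automated so that the meeting time goes to the discussion itself. The sketch below is one possible rule, not the only one: it flags every vote that sits more than `tolerance` points away from the median of its component. The `find_outliers` helper, the tolerance of 2 and the votes are hypothetical.

```python
from statistics import median


def find_outliers(component_votes, tolerance=2):
    """Return the votes that sit more than `tolerance` points from the median."""
    mid = median(component_votes)
    return [vote for vote in component_votes if abs(vote - mid) > tolerance]


# Anonymous votes per component for one user story (illustrative numbers).
votes = {
    "complexity": [3, 3, 5, 3],
    "volume": [2, 3, 2, 2],
    "points_of_change": [3, 5, 3, 13],
    "dependency": [1, 2, 8, 1],
}

# Keep only the components where somebody sees a risk the others don't.
risks = {}
for name, component_votes in votes.items():
    outliers = find_outliers(component_votes)
    if outliers:
        risks[name] = outliers

for name, outliers in risks.items():
    print(f"{name}: discuss votes {outliers} before settling on a number")
```

The code only points at where to talk. Whether a high vote comes from knowledge or from insecurity is something only the conversation can reveal.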

Do not finalize the estimation unless the team has accounted for all activities necessary to deliver the product increment – communication, organization, research, UX, UI design, software architecture and design, coding, refactoring, test design, testing and reporting, automation, bug fixing, retesting and verification, deployment. What else?! 🙂 

That’s not all for sure, but it’s enough to get the conversation started. Is your team comfortable giving estimations? Are you all aware of how much one story point is? Do you really see any predictability, or is it a big fat illusion? The point is to understand what story points mean for your team and make use of them. If you can’t, you might need to reconsider that tool.

Working with clear and truly relative story points can increase team collaboration, motivation and morale. It brings confidence to the team and soothes the uneasy soul of the product people. If done right, and if we continuously learn from every story, it builds trust and improves internal and external relationships through clearer and more consistent communication.