A word from our CTO
Narek Verdian, Chief Technology Officer
Our engineering principles are a practical playbook on how we approach new challenges. These are ideas and concepts, not strict rules. Following these principles will allow for a more consistent, efficient, and ultimately healthier engineering organisation.
Our engineering principles
At Glovo we believe that engineering teams should have the right level of power to make decisions and come up with the best technology solution to the business problem they are solving. Engineering principles are the guiding stars that help teams to navigate the ocean of alternative solutions and make sure the whole organization is moving together in the same direction.
Build it well
We understand that at our scale, it's not only about implementing new features; it's also about how we build them. Failures will occur and we need to handle them gracefully. Systems will be compromised and we need to be prepared. The load will increase, so our systems need to be elastic and adapt to user demand automatically. With all that, we have to implement systems that are reliable, scalable, secure, and compliant with existing regulations, so that we can succeed in our business, providing our users the best experience and ensuring they feel confident using our products.
Behaviors:
- Autoscaling policies are in place.
- Load test processes are in place in production.
- Test pyramid is in place and coding standards and security recommendations are followed.
- When possible, API/APP versioning is applied to mitigate risk and enable backwards compatibility.
- The balance between robustness and speed is analyzed without violating the minimum requirements.
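As a minimal sketch of the API versioning behavior above, the snippet below keeps two response shapes alive side by side so that older clients keep working while newer ones get the updated contract. The order record, the field names, and the `v1`/`v2` labels are hypothetical and only serve the illustration; this is not a description of how any specific service is built.

```python
from typing import Callable, Dict

# Hypothetical order record; field names are illustrative only.
ORDER = {"id": 42, "total_cents": 1250, "currency": "EUR"}


def order_v1(order: dict) -> dict:
    # v1 clients are assumed to expect a total in major currency units.
    return {"id": order["id"], "total": order["total_cents"] / 100}


def order_v2(order: dict) -> dict:
    # v2 exposes integer minor units and the currency explicitly.
    return {
        "id": order["id"],
        "total_cents": order["total_cents"],
        "currency": order["currency"],
    }


# Every supported version stays registered until its clients have migrated.
HANDLERS: Dict[str, Callable[[dict], dict]] = {"v1": order_v1, "v2": order_v2}


def get_order(version: str, order: dict) -> dict:
    """Serialize an order using the response shape of the requested version."""
    try:
        handler = HANDLERS[version]
    except KeyError:
        raise ValueError(f"unsupported API version: {version}") from None
    return handler(order)
```

Adding a `v3` in this sketch means registering one more handler; existing versions are untouched, which is what keeps the change backwards compatible.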
Data is the key to success. We should have instrumentation, observability, and diagnostics strategies that give us clear visibility into our systems and help us act accordingly. Furthermore, as owners, we should design proactive monitoring mechanisms that automatically notify us about any anomaly before anyone else notices, so that we can reduce the impact on our business as soon as possible.
Behaviors:
- Monitoring is implied in every definition of done.
- After deployment, check dashboards.
- All resources in the stack are instrumented and have the right notifications configured.
- Old monitors are reviewed as part of the development.
- Design and iterate over alerts to detect any change of patterns.
- Be the first to be notified when something isn’t working.
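One simple way to "detect any change of patterns" is to compare the latest metric value against its recent history. The sketch below flags a value that deviates from the historical mean by more than `k` standard deviations. The function name, the threshold `k = 3.0`, and the sample numbers are assumptions chosen for illustration; a production alert would normally live in the monitoring platform rather than in application code.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], latest: float, k: float = 3.0) -> bool:
    """Return True if `latest` is more than k standard deviations from the mean."""
    if len(history) < 2:
        # Not enough data to estimate spread; do not alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a change of pattern.
        return latest != mu
    return abs(latest - mu) > k * sigma


# Hypothetical requests-per-second samples from recent intervals.
recent_rps = [100, 102, 98, 101, 99]
print(is_anomalous(recent_rps, 130))  # True: far outside the usual range
print(is_anomalous(recent_rps, 101))  # False: within normal variation
```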
Systems, services, and tables without a clear owner only bring ambiguity to the process and delay progress. When we clean up or improve documentation, code, guides, or anything else, we fill in the gaps that can cause blockers along the way. Benefit from inner-sourcing to proceed with any change you need. And be a good citizen... don't add to the future pain of others by creating potentially unreliable or hard-to-maintain systems.
Call out any lack of ownership of a service, and get your own hands dirty to solve the issue even if it’s not part of your scope. Sometimes a Proof of Concept is better than wasting time finding the right owner.
Behaviors:
- Long-term documentation vs a quick POC.
- Gather data that will help you fail fast.
- Don’t wait to meet 100% requirements, start small and iterate.
- Blocked by another team? Inner-source!
- Extract or don’t extract from monolith? Extract!
- Be accountable for the quality of the codebase.
- Invest in small daily improvements to keep our ‘house’ clean.
Build it fast
Striking a balance between short-term and long-term results is one of our biggest challenges. Delivering fast is not just about short-term speed, so when building a solution, think about its long-term effects and consider scalability, tech debt, etc. However, we should also assume we might be working under the wrong hypothesis. Therefore, you often need to deliver a simple proof-of-concept solution without all the bells and whistles so you can get feedback as soon as possible. That allows you to learn and iterate before investing more time and resources.
Behaviors:
- Slice and iterate having the vision in mind.
- Sometimes additional small efforts are needed in order to minimize future work.
- Focus on unknowns first so that we will be sure that the final solution works.
- Design long term, implement short term.
- Be flexible to iterate based on learnings.
No decision we make will be perfect, so we need to assume our hypothesis might be biased. Likewise, no theoretical plan we conceive will be perfect either. As a consequence, we maximize the probability of eventually doing the right thing if we are ready to make frequent decisions and quickly course-correct based on the feedback we collect from experimentation. Putting the correct monitoring in place will allow us to change our approach quickly, favoring reversible over irreversible decisions.
Behaviors:
- Always A/B test and have a kill switch.
- Make smaller decisions without changing too many variables at once.
- Plan for backwards compatibility.
- Assume everything you are going to do will fail.
- Set up correct monitoring to feed your decisions.
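The A/B test and kill switch behaviors above can be sketched together: users are assigned deterministically to a variant, and a single flag sends everyone back to the control path when monitoring shows a problem. The `Experiment` class, its fields, and the experiment name are hypothetical; in practice the flag would typically be stored in a feature-flag or configuration service rather than in process memory.

```python
import hashlib


class Experiment:
    """A percentage rollout with a kill switch (illustrative sketch only)."""

    def __init__(self, name: str, rollout_percent: int) -> None:
        self.name = name
        self.rollout_percent = rollout_percent  # 0..100
        self.killed = False

    def kill(self) -> None:
        # Reversible decision: flipping the flag restores the old behavior.
        self.killed = True

    def variant(self, user_id: str) -> str:
        if self.killed:
            return "control"
        # Hash the experiment name with the user id so the same user always
        # lands in the same bucket, independently of other experiments.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        bucket = int(digest[:8], 16) % 100
        return "treatment" if bucket < self.rollout_percent else "control"


exp = Experiment("new-checkout", rollout_percent=10)  # hypothetical experiment
print(exp.variant("user-123"))  # stable for a given user while the test runs
exp.kill()
print(exp.variant("user-123"))  # "control" for everyone once killed
```

Keeping the rollout percentage small and changing one experiment at a time matches the "smaller decisions" behavior: if the metrics move, there is only one variable to blame.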
Figure it out
We are in a fast-growing business in which time-to-market is crucial, and being able to identify the ROI of a new feature is very valuable. However, each system has a different set of requirements: some expose an external interface while others are for internal usage only; some handle thousands of requests per second while others have very limited traffic; some cannot afford to lose events while others can live with it. That's why we have to analyze and decide which level of robustness is appropriate for each system based on its requirements and scope.
Behaviors:
- Choose the simplest design that meets most of the requirements according to the current state of knowledge.
- Promote proofs of concept to validate experiments and ideas, and collect data to back them up.
- Be conscious of the impact and probability of failure.
We believe in security and compliance as key requirements of every customer problem definition even if they are implicit. We need to keep our customers secure and ensure compliance with the regulations of the regions in which we operate.
A security breach can carry a huge cost for the company, and we need to be aware of the risks we are taking when we don’t prioritize the best security practices.
Behaviors:
- Protect user data like your own.
- Don’t treat security as a trade-off: don’t compromise, don’t postpone.
- Security vulnerabilities are priority 0 - they should always come first.
- Challenge the need for PII in your systems; store it only when absolutely necessary.
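One possible way to apply the PII behavior above is to filter records through an explicit allowlist before they are stored, so that a new personal field never reaches storage unless someone consciously adds it. The field names and the `minimize` helper below are hypothetical and only illustrate the idea.

```python
from typing import Any, Dict

# Hypothetical allowlist: the only fields this service is assumed to need.
ALLOWED_FIELDS = frozenset({"order_id", "city", "total_cents"})


def minimize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` that contains only allowlisted fields."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


incoming = {
    "order_id": 1,
    "email": "someone@example.com",  # PII this service does not need
    "city": "Barcelona",
}
print(minimize(incoming))  # {'order_id': 1, 'city': 'Barcelona'}
```

An allowlist is used here instead of a blocklist because unknown fields are dropped by default, which fails safe when upstream payloads change.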
Consider the consequences of reusing an existing system vs building something from scratch. Centralization gives relatively small platform teams considerable leverage, allowing them to make improvements that affect the entire company at once. This brings efficiency gains and, if done well, makes everyone collectively faster in the long run. However, decentralization has benefits as well: autonomous teams can be more flexible and often move faster in the short term. There isn’t a simple answer to how to balance the two; the trade-off space needs to be continuously watched and tweaked.
Behaviors:
- Sponsor other clusters'/groups' work.
- Look for what is available within the company and make a trade-off between costs, functionalities...
- Keep in mind how the product will evolve in order to make correct decisions now.
- Share best practices with other teams.
- Document and tag so other teams can reuse it later.
- Be mindful of costs.