Technical Debt In Machine Learning — An Introductory Note

Aug 21, 2022

What is Technical Debt:

According to Wikipedia, “In software development, Technical Debt (also known as design debt or code debt) is the implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer”. In other words, whenever we develop and deploy something quickly to meet a tight deadline, we incur a future cost in the form of lower productivity, rework, or additional operating cost. Just like financial debt, technical debt accumulates over time, and decisions taken during a rushed implementation can eventually kill a project. Paying off technical debt before it compounds is therefore essential.

Why it is even more important for ML pipelines:

Machine learning pipelines carry all the maintenance challenges of conventional software development. On top of that, the uncertain nature of the algorithms and the data makes ML even more susceptible to technical debt, so much so that a pipeline can accumulate months’ worth of debt in a matter of days.

What causes Technical Debt in ML systems:

ML itself: Used carelessly, ML itself can create technical debt. Not every problem needs an ML-based solution. If one is implemented without much thought, or simply by following the media hype, it can incur unnecessary maintenance costs later (note that ML requires a lot of ongoing monitoring and maintenance).
Data: Changes in the data or the data source over time, irrelevant data, or rubbish features can cause an ML pipeline to fail over time. Remember, more data is not always relevant data.
Monitoring: Monitoring model performance over time is crucial in an ML system. If it is not done properly and on an ongoing basis, it can incur huge costs in the future.
Low code quality: Lack of automation, hard-coded variables or values, and missing documentation lead to unmaintainable code, which incurs a future cost.
Feedback loops: A feedback loop occurs when the output of an ML model is fed back into the same model as input. If anything goes wrong in the initial stage of deployment, the error compounds with every pass and can grow very quickly.
Correction cascades: When the ML model does not learn to give the desired output, we often end up manually patching certain parts to deploy the system quickly, intending to fix it later. Done without proper checks in place, this can cause serious issues down the line.
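
To make the feedback-loop point concrete, here is a minimal toy simulation. It is only a sketch under stated assumptions: the function name, the 5% per-pass bias, and the `fresh_data_weight` parameter are all hypothetical and chosen for illustration, not taken from any real system.

```python
def simulate_feedback_loop(true_value, relative_error, rounds, fresh_data_weight=0.0):
    """Repeatedly feed a model's output back in as its next input.

    All parameters are hypothetical and purely illustrative:
    - relative_error: bias the model adds on every pass (0.05 means 5%)
    - fresh_data_weight: share of each new input that comes from
      independently collected ground truth instead of the model itself
    """
    estimate = true_value  # start from a correct estimate
    history = [estimate]
    for _ in range(rounds):
        model_output = estimate * (1.0 + relative_error)
        # Blend the model's own output with fresh ground truth.
        estimate = (fresh_data_weight * true_value
                    + (1.0 - fresh_data_weight) * model_output)
        history.append(estimate)
    return history


closed_loop = simulate_feedback_loop(100.0, 0.05, rounds=20)
anchored = simulate_feedback_loop(100.0, 0.05, rounds=20, fresh_data_weight=0.5)

print(f"closed loop after 20 rounds: {closed_loop[-1]:.1f}")
print(f"anchored after 20 rounds:    {anchored[-1]:.1f}")
```

In this toy setup the fully closed loop ends up at roughly 2.65 times the true value after 20 passes, while the run that mixes in fresh ground truth settles near 105. Real systems are far messier, but the sketch shows why a small early error matters so much more when a model consumes its own output.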

How to manage Technical Debt:

Technical debt is not always a bad thing. Business requirements often force us to develop and deploy something quickly. What is dangerous is not being aware of the debt. We need to be cautious about it, know which parts of the current system can cause it in the future, monitor it continuously, and take the necessary steps to pay it off. Some of these steps are listed below:

Selecting the right, relevant, and good-quality data, and continuously monitoring it for data drift.
Properly documenting the code, algorithm, and deployment method, with proper version control in place.
Stress testing the data and the algorithm under different adverse situations.
Automating the code and putting a robust model performance monitoring system in place.
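
As one possible way to monitor a feature for data drift, the sketch below computes a Population Stability Index (PSI) between a training sample and a production sample. This is a swapped-in, commonly used technique rather than anything prescribed above. The function name, the synthetic data, and the 0.25 alert threshold (a widely quoted rule of thumb) are assumptions for illustration.

```python
import math
import random


def population_stability_index(reference, current, bins=10):
    """Compare two samples of one numeric feature; larger means more drift."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant reference feature: avoid division by zero

    def bin_proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor at a small epsilon so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p = bin_proportions(reference)
    cur_p = bin_proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Synthetic, hypothetical data: production values are shifted by one unit.
rng = random.Random(42)
training = [rng.gauss(0.0, 1.0) for _ in range(1000)]
production = [rng.gauss(1.0, 1.0) for _ in range(1000)]

psi_same = population_stability_index(training, training)
psi_shifted = population_stability_index(training, production)

DRIFT_THRESHOLD = 0.25  # rule-of-thumb value, an assumption in this sketch
if psi_shifted > DRIFT_THRESHOLD:
    print(f"drift detected: PSI = {psi_shifted:.2f}")
```

A check like this could run on a schedule for each important feature, so that drift raises an alert instead of surfacing months later as unexplained model degradation. The bin count and threshold would need tuning for real data.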

Summary:

In summary, technical debt is often unavoidable. However, taking it on should be a deliberate decision and a well-calculated risk. When it is incurred inadvertently, for example through an amateur mistake, it can pile up very fast and potentially even kill a project.