Today, we are conducting an interview with one of our leading AI engineers about the realities of building machine learning systems from the ground up. Uladzislau is a lead software engineer with over four years of experience working on enterprise-level AI platforms like Writer, as well as computer vision and natural language processing tools for various businesses. He has seen models fail spectacularly and succeed in ways that entirely transformed client operations.
We sat down with him to talk about the messy reality of AI development. It is one thing to train a model in a controlled Jupyter notebook environment. It’s an entirely different beast to make that model work reliably for thousands of users. We discussed his biggest failures, his favorite tools, and the lessons he wishes he had learned years ago.
“What is one unexpected challenge you encountered while developing your first AI model that textbooks didn't prepare you for?”
— The sheer amount of garbage data you have to deal with. When you take a machine learning course, you usually get clean, perfectly labeled datasets. You download a CSV file, run a few functions, and everything just works. In my first real AI development project, the client handed us a database of customer images that was an absolute disaster. Half the images were blurry. Some were just pictures of the floor. The labels were manually typed by different employees who used entirely different naming conventions. I spent two weeks just writing scripts to clean the data before I even thought about training a model. Textbooks, unfortunately, don't teach you that 80% of your job is acting like a digital janitor.
“How did you overcome it, and what would you tell someone just starting?”
— I had to stop trusting the data completely. I started writing defensive data pipelines. I built validation checks that threw aggressive errors the moment an image was the wrong resolution or a label didn't match a strict predefined list. If you are just starting, my biggest piece of advice is to never assume your data is ready. Look at it manually. Print out random samples and actually look at them with your human eyes. One of the core AI development challenges is that a model will happily learn the wrong patterns if you feed it bad data. You have to be paranoid about what goes into your system.
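The defensive checks described above can be sketched roughly like this. This is a minimal stdlib-only illustration; the field names, allowed labels, and expected resolution are illustrative assumptions, not taken from the actual project.

```python
# Sketch of a defensive validation gate for incoming training records.
# Field names, labels, and the resolution are illustrative assumptions.

ALLOWED_LABELS = {"defective", "ok"}      # the strict predefined list
REQUIRED_RESOLUTION = (640, 480)          # expected (width, height)

def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors; empty means the record passes."""
    errors = []
    label = record.get("label")
    if label not in ALLOWED_LABELS:
        errors.append(f"unknown label: {label!r}")
    resolution = record.get("resolution")
    if resolution != REQUIRED_RESOLUTION:
        errors.append(f"bad resolution: {resolution!r}")
    return errors

def validate_batch(records: list[dict]) -> list[dict]:
    """Fail loudly on the first bad record instead of training on it."""
    for i, record in enumerate(records):
        errors = validate_record(record)
        if errors:
            raise ValueError(f"record {i} rejected: {'; '.join(errors)}")
    return records
```

The point is the aggressive failure mode: a record that would silently corrupt training instead stops the pipeline with a specific error message.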
“Can you share one debugging technique that has saved you the most time in AI development?”
— Overfitting a single batch of data. Whenever I write a new model architecture, I take just five to ten examples from my dataset and try to train the model exclusively on those. I turn off all regularization and just let it run.
If the model is built correctly, the loss should drop to essentially zero. The model should perfectly memorize those few examples. If the loss does not drop, there is almost certainly a fundamental bug in the code. Maybe the matrix dimensions are misaligned, or the gradients aren't updating properly.
“What made this approach more effective than others you've tried?”
— It isolates the problem. AI debugging techniques are notoriously difficult because failures are silent. If you write a web app and mess up the code, the app crashes. You get a stack trace. If you mess up a neural network, it usually just keeps training but gives you terrible predictions at the end.
Before I started overfitting single batches, I would train a model on the full dataset for six hours, only to realize the accuracy was stuck at 50%. Then I had to guess if the data was bad, the learning rate was wrong, or my code had a bug. Overfitting a tiny batch takes ten seconds and proves the mathematical plumbing actually works.
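The sanity check can be demonstrated in miniature without any framework. This is a hedged stdlib sketch using a hand-rolled linear model on deliberately simple data (y = 2x + 1) rather than a real network: a correct training loop drives the loss to essentially zero on the tiny batch, while a pipeline with a common bug (here, forgetting to update the bias) visibly plateaus.

```python
# Sketch of the "overfit one tiny batch" sanity check, using a hand-rolled
# linear model and full-batch gradient descent instead of a real network.
# The data is realizable by the model, so a correct loop must reach ~0 loss.

def train(steps=2000, lr=0.3, buggy=False):
    """Fit y = w*x + b to five points by full-batch gradient descent."""
    xs = [0.0, 0.25, 0.5, 0.75, 1.0]
    ys = [2 * x + 1 for x in xs]          # the tiny batch to memorize
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(xs)
            grad_b += 2 * err / len(xs)
        w -= lr * grad_w
        if not buggy:
            b -= lr * grad_b              # the "bug": skipping this update
    # mean squared error on the same tiny batch
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

print(f"correct pipeline: loss = {train():.2e}")          # essentially zero
print(f"buggy pipeline:   loss = {train(buggy=True):.2e}")  # stuck well above zero
```

The ten-second version of this check catches the bug before the six-hour training run does.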
“What is one tool/framework you wish you had discovered earlier in your AI development journey?”
— For large enterprise projects, it’s "Weights & Biases". It lets the team manage experiments and share results with other teams easily. When you reach a certain scale, MLOps becomes a "must," and having a centralized platform for experiment tracking and model registry makes it much easier to maintain reproducibility and simplify collaboration.
“How has it changed your workflow since you started using it?”
— It made my workflow more structured and reliable. Instead of manually tracking experiments, you can automatically log parameters, metrics, and datasets. Any change in performance can be quickly tied to its underlying cause. You can set up a model registry with clear promotion stages so that only validated models reach production. Overall, it's part of the transition from ad-hoc experimentation to a more mature MLOps process.
“What is one ethical consideration you had to address in a recent AI project?”
— Copyright is a big one. It's somewhat simpler for general-purpose models, but in more specific domains like audio generation, video generation, and coding, there is much less respect for copyright. There is a Stanford study that defines and measures a transparency index for models, and only 3 of the 13 models they evaluated scored well on data acquisition.
“How did you handle it, and what guidance would you offer to others facing similar situations?”
— First, there are a ton of datasets built from works in the public domain. Beyond that, do your best to check the licenses and terms of use on anything you plan to scrape.
“Can you describe one instance where your AI model performed differently in production than in development?”
— I built a computer vision model to detect defective parts on a manufacturing assembly line. In my development environment, the model had a 98% accuracy rate. I showed the client, they were thrilled, and we deployed it. The first day in the factory, the accuracy plummeted to around 60%. I was terrified. It turns out, I trained the model using high-resolution photos taken with a flash. The cameras actually installed on the factory floor were cheap, low-res webcams, and the lighting completely washed out the metal parts.
“What did you learn, and how do you now prepare for deployment?”
— Now, I refuse to train a model until I get sample data from the exact hardware that will be used in production. If the client is going to use a cheap webcam, I want my training data to look like it was taken on a cheap webcam.
“What is one unconventional data preprocessing technique that significantly improved your model's performance?”
— Intentionally making the training data worse. It sounds completely counterintuitive. Usually, you want your data to be as clean and perfect as possible. But I started aggressively applying data augmentation that degrades the image quality. I wrote scripts to randomly blur images, drop the contrast, add digital noise, and artificially pixelate them. I basically forced the model to learn from really ugly, degraded pictures alongside the good ones.
“Why do you think it worked so well?”
— It forces the model to learn the core features of the object rather than memorizing high-frequency textures that might disappear in bad lighting. When you do AI model performance optimization, you are trying to make the model robust. If it only learns from perfect photos, it won’t work on a blurry one.
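The degradation pipeline described here can be sketched with plain Python on a grayscale image stored as a list of rows of 0-255 pixel values. The specific operations and parameters below are illustrative assumptions, not the production pipeline; a real version would typically use an augmentation library on tensors.

```python
import random

# Sketch of "make the training data worse" augmentation on a grayscale
# image (list of rows of 0-255 ints). Parameters are illustrative.

def drop_contrast(img, factor=0.5):
    """Pull every pixel toward mid-gray, washing out the image."""
    return [[int(128 + (p - 128) * factor) for p in row] for row in img]

def add_noise(img, amount=20, rng=None):
    """Add uniform pixel noise, clamped to the valid 0-255 range."""
    rng = rng or random.Random(0)
    return [[max(0, min(255, p + rng.randint(-amount, amount))) for p in row]
            for row in img]

def box_blur(img):
    """Crude 3x3 mean blur (edge pixels kept as-is for brevity)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)) // 9
    return out

def degrade(img):
    """Chain the degradations, mimicking a cheap camera in bad lighting."""
    return box_blur(add_noise(drop_contrast(img)))
```

Mixing `degrade(img)` outputs into the training set alongside the clean originals is what forces the model to stop relying on high-frequency texture.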
“What is one collaboration challenge you've faced when working with non-technical stakeholders on AI projects?”
— Managing their expectations of what AI can actually do. There is so much hype right now. Non-technical stakeholders read a headline about a massive new language model and assume AI can do everything.
“How did you bridge the communication gap?”
— Instead of trying to explain how a neural network operates, I show them the output probabilities. I say, "The model is 85% confident this document is an invoice, but only 40% confident it has the right total amount. Do we want to automate a process with a 40% confidence rate?" That reframes the conversation. It turns a magical black box into a standard business risk assessment.
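That risk-assessment framing maps naturally onto a simple decision gate. This is a hypothetical sketch; the field names and the 0.8 threshold are assumptions for illustration, not a client's actual policy.

```python
# Sketch of turning model confidence into a business decision gate.
# Field names and the threshold are illustrative assumptions.

def route_prediction(confidence: float, threshold: float = 0.8) -> str:
    """Automate only above the agreed confidence; otherwise send to a human."""
    return "automate" if confidence >= threshold else "human_review"

predictions = {"document_type": 0.85, "total_amount": 0.40}
decisions = {field: route_prediction(conf) for field, conf in predictions.items()}
# document_type gets automated; total_amount is flagged for review
```

The threshold then becomes something stakeholders can reason about and set themselves, rather than a property hidden inside the model.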
“Can you share one resource or learning method that accelerated your understanding of a complex AI concept?”
— Re-implementing academic research papers from scratch without looking at the authors' code. When you read a paper, everything makes perfect sense. But when you actually try to write the code in PyTorch or TensorFlow, you realize you don't actually understand how the tensor dimensions align. For a long time, I just relied on high-level APIs. I didn't actually know how an attention mechanism worked; I just imported it. Sitting down and writing a Transformer model line by line from the original paper completely changed the way I work.
“What made it particularly effective for you?”
— It forces you to deal with the messy reality of math. You have to figure out the exact shapes of the matrices. You have to handle the edge cases that the paper glossed over.
It is frustrating and painful, and it takes days. But once you build it yourself, the concept sticks in your brain forever.
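In that spirit, here is what the core of an attention mechanism looks like when written from scratch on plain Python lists: scaled dot-product attention, softmax(QKᵀ/√d_k)V. The shapes are tiny and illustrative; a real Transformer adds learned projections, multiple heads, and masking.

```python
import math

# Scaled dot-product attention from scratch on 2-D Python lists,
# in the spirit of re-implementing the mechanism without a framework.

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    mx = max(row)
    exps = [math.exp(v - mx) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V for query, key, and value matrices."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)
```

Working out why `transpose(k)` is needed and what `d_k` scales is exactly the kind of detail a paper glosses over and an import hides.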
Building AI systems is deeply chaotic. The math is exact, but the application of that math to the real world is incredibly messy. You will spend days hunting down silent bugs. You will argue with stakeholders about what a model can actually achieve. You will deploy a system that works perfectly on your laptop and completely fails the moment a real customer touches it.
But when it works, there is nothing quite like it. Watching a machine learn a complex pattern and execute a task autonomously is an incredible feeling. The key is to stay skeptical of your data, test aggressively in real-world conditions, and remember that an AI model is just a tool.