From latent spaces to JWTs: how agents taught me backend

I wrote code every day for years and somehow felt like I wasn’t really writing code. Let me explain.
I’ve been working as a data scientist since 2019. Even within that relatively short window, my areas of expertise have constantly shifted, driven by evolving ML techniques, changing market expectations, and the different roles I’ve taken on across companies.
My AI work started with computer vision, moved to tabular data, then big data analytics, then early language models, and with the rise of generative AI, diffusion models. Almost every year brought a new domain to specialize in.
Having started in what I’d call the early days of data science, I watched needs emerge rapidly and solutions get standardized at nearly the same pace. Tasks that were once considered serious engineering challenges, like license plate recognition or defect detection on production lines, are now short-term projects with reliable outcomes. Some problems have been so thoroughly solved that major cloud providers ship them as AutoML packages. This constant commoditization meant data scientists always had to keep moving, picking up new domains and new problem-solving skills.
A data scientist works differently from a software engineer. Yes, you use a programming language, but the programs you write are essentially elaborate ways of operating complex calculators. (Generalizations are always wrong, but this is largely why data scientists live in Jupyter Notebooks: the flow runs top to bottom.)
In the latest phase of my career, going deep on diffusion models, I found myself working with far more complex codebases than typical data science workflows demand. I was writing code rooted in mathematical optimization, doing algebra in latent spaces, and working with neural network architectures at a level that bordered on real software engineering. It was exciting work.
But here’s the tension: in data science and deep learning, you only ever brush up against other engineering disciplines. At best you consume an API, scrape some data, maybe touch a bit of Docker. Authentication, object-oriented design, microservices, web frameworks, REST APIs, databases beyond writing queries: none of that is your priority. Your depth goes into the mathematical side and state-of-the-art deep learning model architectures, because those fields alone are deep enough to spend a career in.
I was deepening on the math side while remaining shallow on the engineering side. That’s what I mean by writing code every day and feeling like I wasn’t really writing code.
Then I found my way out: AI agents
I asked my manager if I could drop diffusion models entirely. All I wanted was to build agents. It was the right opportunity to learn the engineering fundamentals I’d been missing. Building agents would force me to ship a real product and push me into backend territory.
For those unfamiliar: agents are semi-autonomous programs that use LLMs to perform tool calling. You define a set of capabilities as functions, and the agent translates natural language into sequences of those function calls. They query databases, search the web, read documents, run calculations, really any function you can implement. Then they fold the results back into their context and respond in natural language. Building an agent means building all of these systems yourself.
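The loop described above can be sketched in a few lines. This is illustrative only: the "model" here is a stub that returns a canned decision, and `get_weather` is a hypothetical tool, but the shape of the loop (decide, call tool, fold result into context, repeat until there's an answer) is the core of an agent.

```python
# Minimal agent loop sketch. The LLM is faked; a real agent would send the
# prompt and accumulated context to a model and parse its tool-call response.

def get_weather(city: str) -> str:
    """Hypothetical tool: a real agent would hit a weather API here."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def fake_llm(prompt: str, context: list) -> dict:
    """Stand-in for an LLM call. A real model would choose the tool and
    arguments from the prompt; here the decision is hard-coded."""
    if not context:
        return {"tool": "get_weather", "args": {"city": "Istanbul"}}
    return {"answer": f"The forecast: {context[-1]}"}

def run_agent(prompt: str) -> str:
    context = []
    while True:
        decision = fake_llm(prompt, context)
        if "answer" in decision:          # model is done: respond in natural language
            return decision["answer"]
        tool = TOOLS[decision["tool"]]    # translate the decision into a function call
        context.append(tool(**decision["args"]))  # fold the result back into context

print(run_agent("What's the weather in Istanbul?"))
```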

What I didn’t anticipate was how quickly agent development would expose every gap in my engineering knowledge. The ML parts were comfortable. Everything around them was not.
The first wall was security. As a data scientist, I had never once needed to think about how two services authenticate with each other. Now I had to figure out how my agent would communicate securely with my backend, how my backend would talk to external services, and how I’d protect the entire system once it was exposed to the internet. I went from barely knowing what a JWT was to implementing RS256-signed token flows, not because I wanted to learn cryptographic signing, but because my agent couldn’t securely call a single tool without it.
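For readers who, like past me, barely know what a JWT is: it's just three base64url-encoded parts, `header.payload.signature`. The sketch below shows that anatomy with stdlib-only HS256 (a shared-secret HMAC); RS256 has exactly the same shape, except the signature is produced with an RSA private key and verified with the public key, which requires a crypto library.

```python
# JWT anatomy sketch using HS256 (shared-secret HMAC), stdlib only.
# RS256 keeps the same header.payload.signature structure but swaps the
# HMAC for an RSA signature (private key signs, public key verifies).
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url encoding without padding, as JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_hs256(token: str, secret: bytes) -> bool:
    header, body, sig = token.split(".")
    expected = b64url(
        hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    )
    return hmac.compare_digest(sig, expected)  # constant-time comparison

token = sign_hs256({"sub": "agent-1"}, b"dev-secret")
assert verify_hs256(token, b"dev-secret")
assert not verify_hs256(token, b"wrong-secret")
```

In production you would reach for a maintained library rather than hand-rolling this, but writing it out once makes the token flow much less mysterious.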
Then came async. My agent needed to call multiple tools, some of them slow, and I couldn’t have it blocking on each one sequentially. So I had to learn async/await patterns in Python, not the toy examples from tutorials, but real async database sessions, connection pooling, and handling concurrent tool executions without race conditions. FastAPI made the async-first approach natural, but that also meant every dependency I plugged in had to play nicely with async, and many didn’t.
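The payoff of going async is easy to show in miniature. In this sketch (with hypothetical tool names and simulated delays), `asyncio.gather` fans three slow tools out concurrently, so total wall time is roughly the longest delay rather than the sum of all three.

```python
# Concurrent tool execution sketch: the sleeps stand in for slow I/O
# (a web search, a database query). Sequential calls would take ~0.45s;
# gathered, they overlap and finish in ~0.2s.
import asyncio
import time

async def slow_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulated network / DB latency
    return f"{name} done"

async def run_tools_concurrently() -> list:
    # gather schedules all three coroutines at once and preserves order.
    return await asyncio.gather(
        slow_tool("search", 0.2),
        slow_tool("db_query", 0.1),
        slow_tool("calculator", 0.15),
    )

start = time.perf_counter()
results = asyncio.run(run_tools_concurrently())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```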
Each of these problems cascaded into the next. Securing the API forced me to understand middleware. Middleware forced me to understand the request lifecycle. The request lifecycle forced me to understand how FastAPI’s dependency system actually works. Seven months in, I can trace a request from ingress to database query to agent tool call and back, and I understand why every layer exists.
The Pivot
I’ve been doing backend development for about seven months now. I wouldn’t call myself an expert (I’m not), but building agents taught me what software engineering actually feels like. Not the calculated, cell-by-cell flow of a notebook, but the layered, interconnected, sometimes frustrating work of making systems talk to each other reliably and securely.
Backend work filled every gap I’d carried since my data science days. It made me fall back in love with programming in a way I hadn’t expected. The industry has settled on a term for this intersection: AI Engineer, someone who works across both backend systems and AI. That fits, and I’m glad the role exists.
That’s my story of pivoting from data science into backend through agent development. If you’ve had a similar path, skills you picked up that I haven’t mentioned, gaps you didn’t know you had until you hit them, I’d be interested to hear about it.