I am not an AI expert, nor are you.
How self-proclaimed experts risk poisoning the AI ecosystem.
Self-proclaimed AI experts are everywhere: on podcasts, in blog posts, in LinkedIn feeds. People re-post headlines and news, amplifying the noise of marketing announcements with no real connection to practical use cases.
This behavior, quite common during every technological revolution, can be highly disruptive when dealing with AI.
The problem is that in artificial intelligence there is an enormous gap between a PoC, or the theory, and the hard reality. This gap, combined with a fast-moving state of the art and challenges that are still unsolved, amplifies the perception that with AI, nothing works in production.
Several factors create and maintain this perception. First, the ease of building a working prototype leads people to believe production is just a few days away; this is fueled by people under the Dunning-Kruger effect, who underestimate the complexity of reality. Second, benchmarks can be really misleading. I have been told far too many times that we should do X because ChatGPT does X.
Ok, repeat after me: ChatGPT is not a benchmark, because the investment behind that apparently simple application is far greater than what a normal company can put on the table. The result is a heavily optimized solution that only appears simple at first glance. In addition, tech teams within OpenAI have access both to top-level AI professionals and to knowledge of how their LLMs work under the hood.
So, let’s go back to reality: I completed my PhD in 2010, working on neural networks and computer vision, but I do not consider myself an AI expert. I have trained neural networks on GPUs and in the Cloud, faced heating issues, and allocated processing slices. I have written PyTorch implementations. Yet I do not consider myself an expert. This is because I am not an expert.
Every day I learn something new. Every day, I find out there are things I do not know. Things that have a real impact on real issues. And I am not speaking only of manual model training or such things. I am pointing to just the managed services out there. Each one with parameters that distinguish feasibility from unaffordability.
In AI, even a task such as “document OCR” can be tough when tested on real-life documents, which are often handwritten, rotated, and sometimes blurred. In the last few months, just to read legal contracts, we had to meet a whole set of requirements in terms of speed, cost, and accuracy. Everything seemed super easy on paper, but when faced with reality, we found we needed a dataset of “hard documents,” which are not hard at all for humans. We tuned parameters, and every time we met one requirement, another piece of the puzzle stopped working. We needed to define metrics to build KPIs and evaluate different methods across different dimensions. We needed to pre-process the documents, and discovered that there is no one-size-fits-all solution.
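Evaluating methods across dimensions with different units is the tricky part. A minimal sketch of one way to do it, with hard gates on speed, cost, and accuracy followed by a single weighted KPI for ranking whatever survives; every name, number, and weight below is illustrative, not our actual data:

```python
from dataclasses import dataclass

@dataclass
class OcrRun:
    """One candidate OCR method, measured on the 'hard documents' set.
    All fields and thresholds here are hypothetical examples."""
    name: str
    seconds_per_page: float
    usd_per_1k_pages: float
    char_accuracy: float  # 0..1

def meets_requirements(run: OcrRun,
                       max_seconds: float = 2.0,
                       max_cost: float = 15.0,
                       min_accuracy: float = 0.97) -> bool:
    # Hard constraints: failing any single one disqualifies the method.
    return (run.seconds_per_page <= max_seconds
            and run.usd_per_1k_pages <= max_cost
            and run.char_accuracy >= min_accuracy)

def score(run: OcrRun) -> float:
    # One KPI for ranking survivors: a weighted sum, with each
    # dimension normalized by its threshold so units are comparable.
    return (0.6 * run.char_accuracy
            - 0.25 * (run.seconds_per_page / 2.0)
            - 0.15 * (run.usd_per_1k_pages / 15.0))

candidates = [
    OcrRun("managed-service-a", 1.2, 12.0, 0.981),
    OcrRun("local-vlm", 4.5, 3.0, 0.992),    # accurate, but too slow
    OcrRun("classic-ocr", 0.3, 1.0, 0.91),   # fast, but inaccurate
]

viable = [r for r in candidates if meets_requirements(r)]
best = max(viable, key=score)
```

The gate-then-rank split matters: tuning weights cannot rescue a method that violates a requirement, which mirrors the experience that fixing one dimension breaks another.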
We struggle with context size and meaningfulness, face size constraints, and seek techniques to improve the quality of the data handled by our agents. When we fail, the price is poor accuracy, which in the legal domain means customers cannot use our tools to accomplish their tasks. Then we iterate again: change the model, deal with Bedrock quotas and system-prompt accuracy, only to finally find out we need to select our context chunks using RAG, and that we still need to rewrite queries and filter and rerank chunks. All of this within an acceptable iteration time.
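The rewrite-retrieve-filter-rerank loop can be sketched in a few lines. In this toy version a word-overlap score stands in for both the retriever and the reranker (in a real system those are separate embedding and cross-encoder models), and the query rewrite is a hard-coded placeholder for an LLM call:

```python
def rewrite_query(query: str) -> str:
    # Placeholder for an LLM rewrite step: expand domain terms.
    synonyms = {"termination": "termination cancellation notice"}
    return " ".join(synonyms.get(w, w) for w in query.lower().split())

def overlap(query: str, chunk: str) -> float:
    # Toy relevance score: fraction of query words present in the chunk.
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / max(len(q), 1)

def retrieve(query: str, chunks: list[str], k: int = 4) -> list[str]:
    return sorted(chunks, key=lambda c: overlap(query, c), reverse=True)[:k]

def rerank(query: str, candidates: list[str], top_n: int = 2) -> list[str]:
    # A cross-encoder would go here; we reuse the toy score.
    return sorted(candidates, key=lambda c: overlap(query, c), reverse=True)[:top_n]

chunks = [
    "The notice period for cancellation is thirty days.",
    "Payment is due within sixty days of invoice.",
    "Either party may issue a termination notice in writing.",
    "Annex B lists the applicable jurisdictions.",
]

query = rewrite_query("termination rules")
candidates = [c for c in retrieve(query, chunks) if overlap(query, c) > 0]
context = rerank(query, candidates)
```

Note that without the rewrite step, the chunk about “cancellation” would never be scored as relevant to a query about “termination”: that is the whole point of rewriting before retrieving.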
I work every day with people from Microsoft and AWS who collect feedback from our use case, provide suggestions, and improve their products in the meantime. Then our R&D department has to continuously monitor new papers published at top-level conferences, prepare test cases to evaluate results, and identify what should be included in our product roadmap.
Then I stumble into so-called experts: people delivering engaging conference talks about how magical yet powerful this AI world has become. I try to ground their very opinionated statements in our use cases, but after barely listening to two sentences about our problem, they have already identified a 100% working solution.
I dive into their expertise only to find out they have “production expertise” because they can run a model on their PC with Ollama or LM Studio. Buying a $6K laptop and running in the terminal
uvx --from mlx-vlm mlx_vlm.generate \
  --model mlx-community/DeepSeek-OCR-8bit \
  --prompt "Convert this document to Markdown." \
  --image "$IMAGE_PATH" \
  --max-tokens 8192 \
  --temperature 0.0
doesn’t make you an expert. I have a 2600-page document to be processed in less than 3 minutes: a use case requiring at least multiple GPUs, prefetching, and a vLLM deployment. And this is just one example of the many modules we have in our Adaptive Legal Assistant LIDIA. Legal operations workflow management through agents is another optimization pain point: context is very limited when you have tens of documents with hundreds of pages each, and context rot is waiting at the end of the road to remind you that “a bigger context” doesn’t solve your issues. Then you have to add RAG to the equation, but for it to be effective you need metadata from your documents and a proper chunking technique. Just to begin.
But our “advanced experts” aim no higher than writing a sufficiently long prompt for ChatGPT, and are ready to explain that they run it in production, as they show in their presentations: loading an Excel file into the chat and summing the values in a column.
Let’s set the expectations: running a model on your PC does not make you an expert. Bringing the model into production on multi-GPU hardware, with optimized response times, makes you an expert.
Writing a prompt, however complex, does not make you an AI expert. The state of the art is agent development with context engineering, tool calling, and agent swarm orchestration. I know how to do all these things; nonetheless, I am struggling every day with the constraints of the state of the art.
Yet I am not an AI expert. And I can tell that, most probably, neither are you.


