May 29, 2025
By Hubert Brychczynski
Artificial Intelligence,
Frontend Engineering,
Backend Engineering,
Generative AI,
Software Engineering
Here’s an unpopular opinion: everybody is using large language models. Few admit it. Fewer still endorse it.
Translators routinely query ChatGPT about tricky phrasing. Content creators like me use it to refine their work, as I did for this article (talking about you, ChatGPT!). And software engineers, including those at Janea Systems, have been leveraging generative AI to streamline their tasks for quite some time.
Yet people often decry, dismiss, or ridicule large language models, despite almost certainly using them when no one's watching. We're eager to optimize our workloads with AI, quick to mock its mistakes, and reluctant to share our original work with it.
The ambivalence toward AI stems, in part, from its very design. Large language models are probabilistic. Almost every public LLM interface (with the telling exception of Grok) warns users to double-check outputs for accuracy. Meanwhile, a growing body of research, driven largely by journalists and scientists, challenges industry claims about LLM reliability, exposing just how often even the best models hallucinate.
That said, you and I both know from experience that large language models are sometimes genuinely useful.
The real question is: when, and how?
This is what we set out to determine in our recent experiment.
The experiment consisted of two phases: engineers first solved tasks on their own, then tackled comparable tasks with AI assistance.
What we found is enough to fill two posts: this one covers front-end and back-end engineering, and a follow-up will cover the remaining domains.
Here's what you can expect to learn: how we structured the experiment, how much faster engineers completed tasks with AI, and where AI assistance helped or fell short.
We selected four expert-to-senior-level engineers per domain. Each engineer solved four domain-specific problems: two unaided and two with AI assistance.
Afterward, engineers filled out quantitative and qualitative surveys, providing data on time spent, tool usage, and their personal experiences with AI assistance. They also assessed their domain expertise, the skill set required for each task, and familiarity with prompt engineering techniques.
Front-end and back-end engineers each grappled with a set of domain-specific tasks.
On average across all domains, engineers finished tasks approximately 3.2 hours faster with AI, a reduction of roughly a third (33.5%) in completion time, from 9.56 hours to 6.36 hours per task.
Engineers praised AI’s ease of use but consistently noted that AI-generated solutions required modification before they could serve as viable proofs of concept.
Figure 1 illustrates the domain-by-domain speedup. The largest gains occurred in front-end and back-end tasks—66.94% and 55.93%, respectively.
Fig. 1: Task performance improvement across domains
Figure 2 shows the proportion of AI-generated solutions that worked “out of the box.” Back-end engineering led with a 100% success rate, followed by front-end engineering at 87.5%.
Fig. 2: Percentage of AI-generated solutions working out of the box
A functional solution isn’t always a polished one. Back-end solutions may have worked immediately, but engineers still spent time refining them.
Figure 3 reflects engineer self-assessment of time spent improving AI-generated solutions, where “1” indicates extensive time spent and “5” indicates minimal time.
Fig. 3: Time spent improving AI-generated solutions
Figure 4 reflects engineer self-assessment of the number of changes made to AI-generated solutions, where “1” indicates few changes and “5” indicates many.
Fig. 4: Number of changes made to AI-generated solutions
Higher proficiency in prompt engineering and greater domain expertise correlated with better outcomes when using AI for coding tasks. However, only one in four engineers had studied prompt engineering prior to the experiment.
Table 1 presents average self-reported assessments of expertise, tool proficiency, and prompt engineering familiarity across all domains.
Table 1: Engineer self-assessment
Front-end engineers used AI for research, reference, guidance, and rapid prototyping. They found AI particularly useful for tasks involving visual elements or interactive components, such as brainstorming UI patterns or explaining complex DOM manipulations.
However, AI-generated solutions often violated accessibility and performance best practices, introduced inconsistencies and unnecessary complexity, or created unintended dependencies, which required careful verification and refactoring.
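To make that concrete, here is a hypothetical TypeScript sketch, invented for this article rather than taken from the experiment: an AI-drafted clickable <div> of the kind that breaks keyboard and screen-reader access, alongside the native <button> an engineer would refactor it into.

```typescript
// Hypothetical example, invented for this article: a clickable <div>
// of the kind AI assistants often draft. It is invisible to keyboard
// and screen-reader users because a <div> carries no button semantics.
const aiGenerated = document.createElement("div");
aiGenerated.className = "submit-btn";
aiGenerated.textContent = "Submit";
aiGenerated.addEventListener("click", () => console.log("submitted"));

// Refactored version: a native <button> provides focus handling,
// keyboard activation (Enter/Space), and correct semantics for free.
const refactored = document.createElement("button");
refactored.type = "button";
refactored.className = "submit-btn";
refactored.textContent = "Submit";
refactored.addEventListener("click", () => console.log("submitted"));

document.body.append(aiGenerated, refactored);
```

Fixes like this are small on their own, but they add up across a generated codebase, which is why engineers reported spending time on verification and refactoring.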
Back-end engineers used AI as an advanced autocomplete and conversational reference. The models quickly produced code suggestions, reference snippets, and structural templates, saving time on boilerplate, debugging, and manual research.
When AI-generated code contained errors, engineers often struggled to trace the root causes, since they had neither authored nor fully reasoned through the code themselves. Moreover, some AI-generated solutions proved outdated or inconsistent with internal standards, introducing minor compatibility issues.
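As a rough illustration of the boilerplate involved, here is a hypothetical TypeScript/Express sketch (the endpoint, field names, and port are invented for this article): the kind of route handler an assistant drafts in seconds, which an engineer still reviews against internal standards.

```typescript
import express from "express";

const app = express();
app.use(express.json());

// Typical assistant-drafted boilerplate: a CRUD-style endpoint with
// basic input validation. The endpoint and field names are invented.
app.post("/api/users", (req, res) => {
  const { name, email } = req.body ?? {};
  if (typeof name !== "string" || typeof email !== "string") {
    return res.status(400).json({ error: "name and email are required" });
  }
  // Persistence is stubbed out; a real handler would call a data layer
  // and follow the team's conventions for errors, logging, and auth.
  res.status(201).json({ id: Date.now(), name, email });
});

app.listen(3000, () => console.log("listening on :3000"));
```

Drafts like this save typing, but as the engineers noted, a subtle error buried inside generated code is harder to debug than an error in code you wrote and reasoned through yourself.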
The numbers speak for themselves: AI gives front-end and back-end engineers at Janea Systems a staggering advantage. Performance gains of 66.94% and 55.93% mean we can deliver projects much faster than teams that shun AI. And we have the data to prove it.
What about the other domains?
Stay tuned for the next article to find out.
As an AI-turbocharged software engineering company, we harness human-AI synergy for superior, accelerated software product development, research, and prototyping. Here are three recent examples:
We developed an exploratory prototype of an LLM-powered fact-checking tool, using a segmented architecture and off-the-shelf components to deliver it in just three months.
We are actively optimizing a cutting-edge chatbot platform that helps sports scouts find and acquire talent more effectively.
We integrated Semantic Kernel to enable generative AI in the Advanced Paste tool, part of a project that has resolved 25k issues and earned 118k stars on GitHub.
From prototype to production, let's put AI to work for you.
Ready to discuss your software engineering needs with our team of experts?