AI-Assisted Software Development Across the SDLC

Generative AI helps with a wide range of tasks: writing stories, creating images, composing music, and coding. Give it a short prompt, and it turns your idea into a first draft in seconds.

In the software development life cycle, AI tools have become handy replacements for the resources we used to rely on for finding information and solving coding issues: Google Search and Stack Overflow. However, the question of how effective and reliable these tools are remains open. We conducted an experiment to see whether AI can be a trustworthy coding assistant.

We focused on two of the most popular AI tools for software developers: ChatGPT and GitHub Copilot. Our team assessed their accuracy and ease of use across simple and advanced cases, as well as the time savings they bring. Below, we share our findings.

Experiment Design

Our research compared ChatGPT and GitHub Copilot across various software engineering tasks. We wanted to find out if they could handle coding tasks on their own, acting as assistants who don’t require detailed instructions each time.

For each challenge, we asked participants to fill out a survey that covered a few key points:

Solution accuracy: How correct was the solution on a five-point scale?
Time efficiency: How quickly did the tool provide results, and how much time did it save?
User experience: How easy was the tool to use?

We tested ChatGPT and GitHub Copilot on a wide range of programming problems:

Simple algorithms (finding maximum values in arrays, reversing strings, and checking for prime numbers)
Complex algorithms requiring advanced computation
Code refactoring
Bug fixing
Boilerplate code generation for standard project structures (REST APIs and basic web applications)
Unit test creation
Documentation and email creation
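
To give a sense of scale for the simple-algorithm category, here is a minimal sketch of the three example tasks named above. It is an illustration only: the function names and implementations are our assumptions for this article, not the prompts or the output recorded in the experiment.

```python
def find_max(values):
    """Return the largest element of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    largest = values[0]
    for value in values[1:]:
        if value > largest:
            largest = value
    return largest


def reverse_string(text):
    """Return the characters of text in reverse order."""
    return text[::-1]


def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True
```

Tasks of this size have a single, easily checked correct answer, which is why they work well as a baseline before moving on to refactoring or bug fixing.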

Throughout the AI-assisted software development process, each engineer used both AI tools on the same tasks, providing brief instructions and clarifications when needed. We gathered both numerical data, such as accuracy and time saved, and qualitative data in the form of participants’ feedback, to get a full picture of coding with Copilot and ChatGPT.

Our Experiment Findings

Our tests revealed that both AI tools for software development performed well on simple tasks. They quickly generated accurate code for basic algorithms and standard setups. On the flip side, more complicated problems and specialized challenges needed more guidance, and we had to adjust our prompts to get better results.

Overall, we collected 99 responses. Our findings are as follows.

Solution Accuracy

GitHub Copilot did a great job with simple algorithms, managing to solve all 10 basic challenges without any corrections required. ChatGPT excelled at text-based tasks. It created clear documentation, effective command-line scripts, and professional emails on the first attempt.

When we moved on to more complex problems, we had to tweak our prompts multiple times. Still, the initial suggestions from the AI developer tools served as a very good starting point, which made improvements relatively easy.

Participants rated the accuracy of each result on a five-point scale (where 5 means fully correct and 1 means fully incorrect). Here’s the summary of our findings:

Fully correct (5): 63.2%
Mostly correct (4): 28.4%
Partly correct, partly incorrect (3): 4.2%
Mostly incorrect (2): 3.2%
Fully incorrect (1): 1.1%
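
The distribution above can be condensed into a single average score. The sketch below is a derived calculation, not a figure reported in the survey itself; the function names are ours, and the shares are normalized because the rounded percentages add up to 100.1%.

```python
# Share of responses per score, in percent, as listed in the summary above.
distribution = {5: 63.2, 4: 28.4, 3: 4.2, 2: 3.2, 1: 1.1}


def mean_score(dist):
    """Weighted average score, normalized by the total share."""
    total = sum(dist.values())  # 100.1 here because of rounding
    return sum(score * share for score, share in dist.items()) / total


def share_at_least(dist, threshold):
    """Fraction of responses rated at or above the threshold."""
    total = sum(dist.values())
    return sum(share for score, share in dist.items() if score >= threshold) / total


print(f"Mean score: {mean_score(distribution):.2f}")
print(f"Rated 4 or 5: {share_at_least(distribution, 4):.1%}")
```

Under these assumptions, the average lands at roughly 4.5 out of 5, with about 92% of responses rated mostly or fully correct.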

Time Efficiency

Generative AI tools for software development are complete game-changers when it comes to saving time. Developers who previously spent hours searching documentation and debugging simple errors can now fix these issues in minutes.

Copilot takes this a step further by integrating right into your code editor. This way, we didn’t have to switch between apps. Suggestions popped up directly in our workspace, helping us stay focused and keep our flow going.

User Experience

Both ChatGPT and Copilot were incredibly user-friendly, with no steep learning curve. Our team adopted them quickly, not just for coding, but also for writing documentation and generating emails. Plus, the code generation tools responded almost instantly; we’d send a prompt and see results a few seconds later.

ChatGPT vs. Copilot Comparison by Task

Overall, ChatGPT and Copilot achieved a 94% accuracy rate, successfully completing 48 out of 51 assigned tasks.

AI-Assisted Software Development Limitations

Although our experiment brought exciting results, we also noticed several limitations that can impact our further work with ChatGPT and Copilot:

Lack of real-world context: We didn’t test these tools in real-world, hectic coding situations where deadlines are tight and requirements change quickly. In such scenarios, software development AI tools might behave differently.
No long-term evaluation: We evaluated the immediate effect of working with the AI coding assistants, but we didn't track what happens over time, such as whether AI-generated code makes ongoing maintenance harder or changes the way developers learn and grow.
Subjective time measurements: When we assessed how much time Copilot and ChatGPT save, we asked developers how quick it felt, which isn’t a precise measurement. What seems fast to one engineer might just feel normal to another, so our numbers on efficiency aren’t as objective as in the accuracy testing.

Our Recommendations for AI-Assisted Coding

We started our experiment with a simple question: Can software development AI tools deliver reliable results? After testing ChatGPT and GitHub Copilot, we have our answer — and it’s a clear yes.

At the same time, our experience also challenges the common idea that AI will completely replace human programmers. While these tools are getting faster and more accurate, we noticed some limitations in AI-driven software development.

For example, they tend to be less precise in complex tasks or may not take into account nuances like integration problems. Therefore, smart, experienced developers behind the wheel are what make ChatGPT and Copilot truly effective.

We're encouraging our development teams to adopt AI coding assistants, and we want to share some practical advice for anyone who’s going to get started with ChatGPT or GitHub Copilot or take their usage to the next level:

Leverage generative AI for software development: These tools are great for handling boring, repetitive tasks that can drain your energy. Have them generate unit tests, and let ChatGPT and Copilot help with fixing bugs and tidying up your code.
Improve workflow integration: To significantly improve your productivity, make these tools a part of your natural workflow. Connect the AI with your code repositories and project management systems.
Trust but verify: While these code generation tools proved to be accurate, they're not perfect. Set up validation checkpoints for AI-generated code. Better yet, use the AI itself to create tests for the code it just wrote.
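
As one way to picture a validation checkpoint, the sketch below wraps a small helper in unit tests that must pass before the code is accepted. Everything here is hypothetical: chunk_list stands in for any AI-generated function, and run_validation is a name we made up for the gate. Tests drafted by the AI still deserve a human review, especially the edge cases.

```python
import unittest


def chunk_list(items, size):
    """Hypothetical AI-generated helper: split items into lists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkListTests(unittest.TestCase):
    def test_even_split(self):
        self.assertEqual(chunk_list([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    def test_remainder_goes_in_last_chunk(self):
        self.assertEqual(chunk_list([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])

    def test_empty_input(self):
        self.assertEqual(chunk_list([], 3), [])

    def test_invalid_size_is_rejected(self):
        with self.assertRaises(ValueError):
            chunk_list([1, 2, 3], 0)


def run_validation():
    """Run the checkpoint and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChunkListTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A gate like this can run in a pre-merge check: if run_validation() returns False, the AI-generated change goes back for another prompt or a manual fix.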

If you are considering ChatGPT and GitHub Copilot as coding assistants, give AI-assisted coding a try, but keep honing your own expertise regardless.

At Janea Systems, we’ve explored how these tools can support engineering work across multiple software engineering areas. Check out our hands-on experience with Solving Front-End and Back-End Engineering Problems with AI. We’ve also put AI tools to work in ML, Data, and DevOps domains.

What Our Clients Achieved with AI-Assisted Coding

In our work on Bing Maps, our engineers delivered substantial performance and automation improvements by optimizing the geocoding pipeline and deep learning infrastructure. We achieved a 50× speedup in TensorFlow execution, 7× faster training runs, 2× faster batch processing on 2 GPUs, and a 30% dual-GPU speedup in TensorFlow.

Janea Systems also led the engineering effort to make PyTorch available on Windows. This involved enabling native Windows support across the PyTorch ecosystem, including GPU support, packaging, and build automation. Our team implemented full CI/CD infrastructure integration and resolved numerous Windows-specific issues across PyTorch and its dependencies.

As a long-standing contributor to Microsoft's open-source ecosystem, we’ve been trusted to revive PowerToys as open-source software. Our engineers designed and developed the Advanced Paste tool — a productivity feature powered by generative AI. In addition to initial development, our team provided ongoing maintenance, feature evolution, and active support to ensure continued innovation and usability within the PowerToys suite.