Assessment of AI Assisted Programming

AI Assisted Programming

It all seemed to start with GitHub Copilot; now there are many options. Our views of this innovative technology come from multiple decades of working professionally in programming, with a lifelong passion for developing programming tooling and for pursuing software engineering quality in large multinational engineering organisations.

The journey of programming assistance

Programming has always benefited from assistance from (i) humans and processes, and (ii) tooling.

People, Process, and Technology

Humans and Processes

IBM research showed that (human) software reviews were more effective at finding bugs than dynamic testing. The Extreme Programming methodology championed the pair-programming model. All effective software engineering departments include (human) code reviews. My observation is that participating in any of these significantly improves programmers’ skills. Always get expert reviews where you can. The caveat, of course, lies in the quality of the review: both the review process and the expertise of those involved.

Tooling

Tooling has seen the greatest innovation in my lifetime. It incorporates (i) the evolution of programming languages and (ii) the tooling available for them.

Programming Language Evolution Timeline

Programming Language Evolution

Languages evolved from assemblers, which meant no longer having to code directly in binary, to flow control being incorporated into the language itself (Algol, Pascal, C), to modules/packages being first-class citizens in coding (Modula), and to the adoption of object-orientation facilities (C++, Ruby). This progressed to other facilities, such as message passing (Closure), associative arrays (Awk), generics (), ... and more besides. Programming languages became more and more expressive, so much so that when new languages came to the fore, adopters would campaign for their favourite facilities to be incorporated into their newly adopted language.

With great power comes great responsibility.

Voltaire (and, later, Spider-Man)

Additional facilities sometimes brought defect attractors with them. Multiple inheritance is one that has been shown to attract defects, and so it is often deprecated in languages adopting object orientation.

But these are all language facilities.

One arm of programming language evolution is that of ‘C’, considered by many as the epitome of ‘mid-level’ programming languages: one that provides sufficient, well-considered constructs over assemblers without embracing facilities that bring with them risks of performance degradation or complexities that detract from its minimalist ethos. Throughout all that has happened since ‘C’ was born, it has remained the most popular systems-level programming language and is often the only mandated programming language in industries such as embedded systems. Indeed, some consider the standardised ANSI C too complex, and a reduction of it is popularised in Motor Industry Software Reliability Association (MISRA) C, a set of guidelines that restricts C to a reduced subset of the language.

Programming Tooling Evolution

As programming languages diversified to champion more and more language paradigms, the tooling for them centred around:

Translation

Assemblers
Compilers
Interpreters

Dynamic Analysis and Testing

Symbolic Debuggers
Performance Analysers
Unit Test Frameworks

Software Creation/Editing

Lints (static analysis) provide a greater range of warnings and insights into the code, even when it compiles OK.
Syntax-directed editors reveal the syntax as we code and can auto-indent and highlight syntax errors.
Variable-name auto-completion editors complete variable names from catalogues they construct by analysing our current code base.
Source-code documentation tools provide a means to better document our code.
AI coding assistance provides generative assistance as we code.
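
To make the catalogue-based completion in the list above concrete, here is a minimal JavaScript sketch. The function names `buildCatalogue` and `complete` are our own illustrations, not taken from any editor, and a regular expression stands in for the proper parsing a real editor would do.

```javascript
// Build a catalogue of identifiers from source text, then complete a prefix.
// Simplified sketch: a real editor would use the language's parser rather
// than a regular expression, and would skip keywords and string contents.
function buildCatalogue(sourceText) {
  const identifiers = sourceText.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || [];
  // De-duplicate and sort so that suggestions appear in a stable order.
  return [...new Set(identifiers)].sort();
}

// Return every catalogued name that extends the typed prefix.
function complete(catalogue, prefix) {
  return catalogue.filter((name) => name.startsWith(prefix) && name !== prefix);
}

const source = "let trackCount = 0; let trackTitle = ''; trackCount += 1;";
const catalogue = buildCatalogue(source);
console.log(complete(catalogue, "track")); // prints [ 'trackCount', 'trackTitle' ]
```

The catalogue knows only which names exist, not what they mean. That is the gap that generative assistance tries to fill.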

What’s going on with AI Coding Assistance?

Robot-assisted programming

Coding assistants read and analyse our code and can offer anticipated completions, prompting us with them before we type. We can then accept them instead of having to type the code ourselves. This can be considered a more advanced form of variable-name auto-completion (as provided by the Apple Xcode editor, say). It reveals a deeper analysis of our source code than mere syntax and variable-name catalogues, both of which are well-established components of all language translators.

Consider LLMs. This technology leverages the mass of online writing and generates responses derived from that pre-existing content through generative mechanisms. The generative algorithms could be considered a rudimentary type of intelligence, capable of adapting pre-existing content to the requirements laid out in the prompts. It is impressive until you encounter hallucinations; at that point, the limitations of the approach become evident. However, getting the tooling to work to one’s benefit is worthwhile. But how far can we take this before the problems with hallucinations outweigh the benefits? What is the best way to benefit from the technology?

Short-form and Long-form Assistance

Codeium

Auto-completions seem to be a clear win when working with AI coding assistance. We use Codeium with Vim, which requires very little setup and hardly any ongoing maintenance. In return, we get a lot of time-saving assistance at no monetary cost, though with some loss of privacy.

Long-form

What about asking an LLM-assisted IDE to code an entire project for you, interacting with it over the entire term of the development? You might have seen the demonstrations on YouTube. Here we give the LLM the view of our entire project and prompt it for changes and fixes to the entire code base. What of this? How practical is it?

Windsurf Editor

Our experience is with the Windsurf IDE and its embedded AI assistant, Cascade, using the current best-of-breed programming LLM, Claude 3.5 Sonnet. Here is the prompt that we started with:

Create a single-page vue.js progressive web app as a streaming client of tracks, which are audio files, each with associated image files and metadata. Includes an audio player for the streamed audio file and a browser for the image, which needs zoom and pan facilities. Allow for the creation and editing of setlists, which are ordered collections of the tracks.

We had no previous experience with the Vue.js framework. The prompt quickly (in about 10 minutes) produced an app, and after less than 2 hours of fixes, we had an app working close to that initial specification. That is a long way from where we eventually want this app to go, but to adopt an entirely new framework and tooling and get a working prototype so quickly is a major win. This was our first experience of this long-form creation of code using LLMs.

We discovered that getting the right versions of the various software dependencies (JavaScript, Node.js, npm, Vue.js) was critical. The LLM was undoubtedly leveraging demonstration programs previously published on the internet. These would have been written for the then-current versions of the tooling, which were probably identified in the accompanying journal or blog. It pays to anticipate what the LLM could be misunderstanding or not taking into consideration and to prompt it on those points, just as one would with a real-world, human coding assistant.
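
One way to surface a version mismatch early is a small check run before building. The following is a minimal sketch, not what we actually used: the names `parseVersion` and `meetsMinimum` are hypothetical, and the minimum version shown is a placeholder rather than the version our project needed.

```javascript
// Compare dotted version strings numerically, e.g. "18.2.0" against "16.14.0".
// Pre-release tags such as "-beta.1" are not handled in this sketch.
function parseVersion(version) {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => parseInt(part, 10) || 0);
}

function meetsMinimum(actual, minimum) {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  for (let i = 0; i < Math.max(a.length, m.length); i++) {
    const left = a[i] || 0;
    const right = m[i] || 0;
    if (left !== right) return left > right;
  }
  return true; // the versions are equal
}

// Placeholder minimum (an assumption); substitute the version the
// generated project was actually written against.
const REQUIRED_NODE = "18.0.0";
if (!meetsMinimum(process.versions.node, REQUIRED_NODE)) {
  console.warn(`Node ${process.versions.node} is older than ${REQUIRED_NODE}`);
}
```

Comparing the parts numerically matters: a plain string comparison would rank "16.3.0" above "16.14.0".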

We committed this as a Git repo and then cloned it on another computer to carry on development there. We had some issues with the toolchain, again in establishing the required versions of Node.js, npm and Vue.js. There was also a misunderstanding about Yarn being used instead of npm for building. Once it built and ran OK, we asked the assistant for some additions to start to fill out the functionality we are looking for.
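
The Yarn-versus-npm confusion can often be settled by looking at which lockfile a repository carries: Yarn writes `yarn.lock` and npm writes `package-lock.json`. Below is a minimal sketch of that check. The function name `detectPackageManager` is hypothetical, and it takes a list of file names so that the logic does not depend on the file system.

```javascript
// Guess the package manager from the lockfile present in a project directory.
// Illustrative helper: it only inspects file names, not their contents.
function detectPackageManager(fileNames) {
  const hasYarn = fileNames.includes("yarn.lock");
  const hasNpm = fileNames.includes("package-lock.json");
  if (hasYarn && hasNpm) return "ambiguous"; // both lockfiles were committed
  if (hasYarn) return "yarn";
  if (hasNpm) return "npm";
  return "unknown"; // no lockfile, so the repository does not say
}

console.log(detectPackageManager(["package.json", "yarn.lock", "src"])); // prints yarn
```

In practice the file list could come from `fs.readdirSync` on the project root. The result is also worth stating explicitly in a prompt, so the assistant does not have to guess.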

This is where we seemed to hit a wall.

At this point, we had taken the approach that the LLM owned the code base, and that our role was to ask for more functionality and to act as a sort of tester, reporting back all the bugs.