The six ages of programming

Coyote in Minnesota

One of the reasons I love the time between Christmas and New Year is that work is quiet and I can experiment. This year, I continued to work on my DuckDB RDF extension, adding support for parsing RDF/XML. There’s really no need for RDF/XML support: hardly anyone uses it any more, largely because it’s such a bear to work with, overly complex and full of largely irrelevant edge cases (for example, a full XML literal as the object).

But that wasn’t going to stop me! Once I had a basic parser integrated into DuckDB (more on that in a moment), my methodology was simple: create a set of unit tests using all the example documents from the W3C spec. Each unit test would parse the RDF/XML and N-Triples versions of an example, pass them to DuckDB and ask it to do a union on the two. If the six columns returned were an exact match, then the count in the union should be the same as the count in the N-Triples file. Side note: I used DuckDB parameterized queries for this, which worked really well. Given a query like this:

prepare determine_delta as
select (select count(*)
        from (select subject, predicate, object, object_datatype, upper(object_lang)
              from read_rdf($testFile||'.nt')
              union
              select subject, predicate, object, object_datatype, upper(object_lang)
              from read_rdf($testFile||'.rdf')) as a)
     - (select count(*) from read_rdf($testFile||'.nt')) as delta;

Then a single test is written as:

execute determine_delta(testFile := 'test/xmlrdf/example07');

that passes if it returns 0 (i.e. the union contributes no rows beyond those already in the N-Triples table, because the two tables match).
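To make the pass condition concrete, here is a minimal Python sketch of the same arithmetic using plain sets instead of DuckDB. The function names and the tuple layout are my own illustration, not part of the extension. It also shows a limitation worth knowing about: a union-based delta stays at 0 when the parsed XML yields only a subset of the expected triples, so a symmetric difference is a stricter variant.

```python
def union_delta(reference, candidate):
    """Mirror the SQL check: count(reference UNION candidate) - count(reference).

    `reference` and `candidate` are lists of tuples such as
    (subject, predicate, object, object_datatype, object_lang).
    UNION removes duplicates, so the union is modelled as a set,
    while the reference count is the raw row count, as in the query.
    """
    union_rows = set(reference) | set(candidate)
    return len(union_rows) - len(reference)


def symmetric_delta(reference, candidate):
    """Stricter variant: number of rows present on one side only."""
    return len(set(reference) ^ set(candidate))


if __name__ == "__main__":
    # Hypothetical rows, standing in for what the two files would yield.
    nt_rows = [
        ("http://example.org/s", "http://example.org/p", "hello", "", "EN"),
        ("http://example.org/s", "http://example.org/q", "42", "", ""),
    ]
    xml_rows = list(nt_rows)

    print(union_delta(nt_rows, xml_rows))          # identical inputs
    print(union_delta(nt_rows, xml_rows[:1]))      # XML side is missing a row
    print(symmetric_delta(nt_rows, xml_rows[:1]))  # the stricter check notices
```

If missing triples matter as much as extra ones, the same idea could be expressed in SQL by also comparing against the row count of the RDF/XML side.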

Having created the tests, I had Gemini write the code, using libxml2 to SAX-parse the XML documents. As each test failed, I gave Gemini the example it failed on and asked it to improve the implementation. I find the mix of deterministic testing combined with probabilistic code generation to be a great match: keep iterating until the tests pass.
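For readers who haven’t met SAX-style parsing, here is a rough sketch of the idea. It is not the extension’s code (that uses libxml2); it swaps in Python’s standard `xml.sax` module, and the handler class and `parse_rdfxml` function are hypothetical names. It covers only the simplest RDF/XML shape: an `rdf:Description` with `rdf:about`, whose child elements carry either a plain literal or an `rdf:resource` attribute. Datatypes, language tags, blank nodes, nesting and XML literals (the edge cases that make the real format painful) are all ignored.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class TinyRdfXmlHandler(ContentHandler):
    """Collects (subject, predicate, object) tuples from event callbacks."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None    # IRI of the enclosing rdf:Description
        self.predicate = None  # IRI of the property element being read
        self.text = []         # character data gathered for a literal

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        if name == (RDF_NS, "RDF"):
            return
        if name == (RDF_NS, "Description"):
            self.subject = attrs.get((RDF_NS, "about"))
            return
        if self.subject is None:
            return  # outside a described node: ignored in this sketch
        # Any other element inside a Description is treated as a property.
        self.predicate = (uri or "") + local
        self.text = []
        resource = attrs.get((RDF_NS, "resource"))
        if resource is not None:
            # The object is an IRI, so the triple can be emitted right away.
            self.triples.append((self.subject, self.predicate, resource))
            self.predicate = None

    def characters(self, content):
        if self.predicate is not None:
            self.text.append(content)

    def endElementNS(self, name, qname):
        if name == (RDF_NS, "Description"):
            self.subject = None
        elif self.predicate is not None:
            # Closing a property element that held a plain literal.
            literal = "".join(self.text)
            self.triples.append((self.subject, self.predicate, literal))
            self.predicate = None


def parse_rdfxml(data: bytes):
    """Parse a small RDF/XML document and return its triples as a list."""
    handler = TinyRdfXmlHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # report (namespace, local) names
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.triples
```

The appeal of the event-driven approach is that state lives in a handful of fields and triples are emitted as the document streams past; the cost is that every extra RDF/XML feature means more state to track, which is where a test suite driven by the spec’s examples earns its keep.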

While I worked on and off on this, I found myself reflecting on the fact that Gen AI-enabled software development is really the next leap in developer productivity. I started my engineering career in the PC era, pre-Internet. You would look up how the SDK worked in a book, and most libraries you used were purchased from vendors and delivered via a mailed floppy disk. Support was largely figuring it out on your own or, if you were in a team, summoning up the courage to ask a senior peer to help.

By the mid-Nineties, those with an Internet connection were asking each other questions in comp.* newsgroups, which became even more powerful when Google started indexing them. There were also dedicated websites: for instance, if you used Oracle you used Ask Tom (which I see still exists!).

The point being that once you had an Internet connection and access to search, your productivity radically changed. The next unlock was Open Source libraries: we went from purchasing (or writing) the libraries we needed to finding and integrating them. The innovations have continued, each time giving us enhanced productivity.

Taking a step back, it feels to me like we’re entering the sixth age of software development, with the ages characterized by ChatGPT and me as:

| Age | Name | Era | Key Innovation | Productivity Unlock |
| --- | --- | --- | --- | --- |
| 1 | Punch Cards | 50s-60s | Batch processing | Compilers replace raw machine code. Glimpses of code portability. |
| 2 | Interactive Terminals | 60s-70s | Time-sharing | Instant feedback replaces batch cycles. Languages standardize, much more portability of code. |
| 3 | PCs | 80s-90s | Local IDEs | Full local dev stack, configuration management tooling starts to appear. |
| 4 | Internet Knowledge | 2000-2010 | Web, Stack Overflow | Access to everyone’s experiences, rise of sites like Stack Overflow. Documentation available via search. |
| 5a | Open Source Reuse | 2010-present | Package ecosystems | “Import instead of build”. Significant re-use of open source libraries. |
| 5b | Cloud & DevOps | 2010-present | Automation + cloud infra | Continuous delivery & reproducibility: no waiting for equipment to arrive, find out immediately if code is unsafe for production. |
| 6 | AI-Augmented | now | LLMs, copilots | Code generation + reasoning |