I’m sure every interaction with every corporate genAI widget is getting datamined. Not just for performance metrics to improve the genAI itself, but for ad targeting data and so on as well.
I’d be kinda surprised if there aren’t a bunch of people also working on the “put a straitjacket on the madman” approach to genAI interactions, as genAI’s weird failure to “automatically” apply all the specialized knowledge it has at its disposal is one of the first things I noticed years ago when these things first started showing up. And I’m not particularly clever. But yeah, if you asked me to lay money on the question I’d be willing to take very long odds that your work is represented in the current training data. Because I’d be astonished if anybody who has used Claude isn’t represented in the training data.
As for the “spy vs spy” approach (of stacking agents and having them work on different things), that’s probably as common as it is because a lot of problems in programming are NP-hard, and unless your genAI proves P=NP there’s no better solution, in general (asterisk, see note), than making guesses and checking them. The note here being that a substantial part of computer science consists of heuristics for rapidly obtaining partial solutions to these kinds of problems, so there’s usually a literature search component as well, whether that involves searching Stack Exchange or using an LLM (which is probably just a fancy permutation of searching Stack Exchange).
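To make “making guesses and checking them” concrete, here’s a minimal sketch in Python using subset sum (a classic NP-complete problem) as a stand-in. The function names and the sample numbers are mine, purely for illustration, and this isn’t meant to describe how any particular agent stack actually works. The point is just the asymmetry: checking a guess is one cheap pass, while finding a guess that passes can mean trying exponentially many candidates.

```python
from itertools import combinations


def check(candidate, target):
    """Verifying a guess is cheap: add it up and compare."""
    return sum(candidate) == target


def guess_and_check(numbers, target):
    """Try every non-empty subset, smallest first, and return the first
    one that passes the check, or None if nothing does.

    Brute force on purpose: the worst case looks at 2**n - 1 subsets.
    """
    for size in range(1, len(numbers) + 1):
        for candidate in combinations(numbers, size):
            if check(candidate, target):
                return candidate
    return None


if __name__ == "__main__":
    print(guess_and_check([3, 34, 4, 12, 5, 2], 9))
    print(guess_and_check([1, 2, 3], 7))
```

The heuristics mentioned above amount to smarter ways of choosing which guesses to try first; the checking step stays the same.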
But after posting I decided to check a couple of other places, and as near as I can tell my personal coding style has polluted some meaningful fraction of all AI answers to questions about TADS3. I just asked Google’s AI search “How do you implement an infinite object dispenser in TADS3/adv3?” and its response included:
A TADS3 code fragment
#include <adv3.h>
#include <en_us.h>

// 1. Define the class for the item to be dispensed
class Pebble: Resource
    '(small) (round) pebble*pebbles' 'pebble'
    "A small, round pebble. "
;

// 2. Define the dispenser
+ bucket: Fixture, ResourceFactory
    '(infinite) (pebbles) bucket' 'bucket'
    "It contains an infinite number of pebbles. "

    // The class of objects this factory creates
    resourceClass = Pebble

    // Allows the factory to accept items back (optional)
    resourceReturnable = true

    // This method ensures there's always one item "visible" to the parser
    initializeFactory() {
        if(contains({ x: x.ofKind(resourceClass) })) return;
        createResource().moveInto(self);
    }
;

// 3. Custom creation logic (if needed)
modify ResourceFactory
    createResource() {
        return resourceClass.createInstance();
    }
;
This is clearly a slightly munged-up version of code I posted in this recent thread. It relies on classes and methods that are not part of stock adv3; they’re presented in the thread, but not in Google’s response. There is the mitigating factor that because it’s nominally a search result it also links the thread.
The stuff I noticed in Claude was a slightly less dramatic form of this. When it needed to come up with an object to use in an example it provided, verbatim, the pebble declaration I’ve used dozens of times in forum posts here and dozens more as code samples in public github repos.
Not, to be clear, that I’m trying to assert some sort of intellectual property rights over those code examples. I’m pretty sure the pebble example I’ve been using is itself a slightly modified form of an example used by Eric Eve in Learning TADS 3. And it’s just because I’ve posted a lot of TADS3 code and I keep using one specific bit of code as a generic example, and because there’s comparatively little other TADS3 code currently being posted, that specific example seems to be over-represented in the training data. And so when Claude needed to come up with something it didn’t use something like that example, it used precisely that example.
There are other things…like I think my original intent (to ask Claude a question I had previously worked out an answer to so I could check its work) backfired. I’m inclined to believe it wasn’t working out a solution similar to the one I worked out; I think what I was watching was Claude copying my homework in real time. But the pebble declaration is a pretty clear example of verbatim re-use (in genAI-created code).