Artificial Effluent

Published by Dominic

February 27th, 2023

A lot of the Discourse around ChatGPT has focused on the question of "what if it works?". As is often the case with technology, though, it's at least as important to ask the question of "what if it doesn't work — but people use it anyway?".

ChatGPT has a failure mode where it "hallucinates" things that do not exist. Here are just a few examples of things it made up from whole cloth: links on websites, entire academic papers, software for download, and a phone lookup service. These "hallucinations" are nothing like the sorts of hallucinations that a human might experience, perhaps after eating some particularly exciting cheese, or maybe a handful of mushrooms. Instead, these fabrications are inherent in the nature of the language models as stochastic parrots: they don't actually have any conception of the nature of the reality they appear to describe. They are simply producing coherent text which resembles text they have seen before. If this process results in superficially plausible-seeming descriptions of things that do not exist and have never existed, that is a problem for the user.

Of course that user may be trying to generate fictional descriptions, but with the goal of passing off ChatGPT's creations as their own. Unfortunately "democratising the means of production" in this way triggers a race to the bottom, to the point that the sheer volume of AI-generated submission spam forced venerable SF publisher Clarkesworld to close to submissions (temporarily, one hopes). None of the submitted material seems to have been any good, but all of it had to be opened and dealt with. And it's not just Clarkesworld being spammed with low-quality submissions, either; the problem is endemic:

The people doing this by and large don’t have any real concept of how to tell a story, and neither do any kind of A.I. You don’t have to finish the first sentence to know it’s not going to be a readable story.

Even now while the AI-generated submissions are very obvious, the process of weeding them out still takes time, and the problem will only get worse as newer generations of the models are able to produce more prima facie convincing fakes.

The question of whether AI-produced fiction that is indistinguishable from human-created fiction is still ipso facto bad is somewhat interesting philosophically, but that is not what is going on here: the purported authors of these pieces are not disclosing that they are at best "prompt engineers", or glorified "ideas guys". They want the kudos of being recognised as authors, without any of the hard work:

the people submitting chatbot-generated stories appeared to be spamming magazines that pay for fiction.

I might still quibble with the need for a story-writing bot when actual human writers are struggling to keep a roof overhead, but we are as yet some way from the point where the two can be mistaken for each other. The people submitting AI-generated fiction to these journals are pure grifters, hoping to turn a quick buck from a few minutes' work in ChatGPT, and taking space and money from actual authors in the process.¹

Ted Chiang made an important prediction in his widely-circulated blurry JPEGs article:

But I’m going to make a prediction: when assembling the vast amount of text used to train GPT-4, the people at OpenAI will have made every effort to exclude material generated by ChatGPT or any other large language model. If this turns out to be the case, it will serve as unintentional confirmation that the analogy between large language models and lossy compression is useful. Repeatedly resaving a jpeg creates more compression artifacts, because more information is lost every time.

This is indeed going to be a problem for GPT-4, -5, -6, and so on: where will they find a pool of data that is not polluted with the effluent of their predecessors? And yes, I know OpenAI is supposedly working on ways to detect their own output, but we all know that is just going to be a game of cat and mouse, with new methods of detection always trailing the new methods of evasion and obfuscation.

To be sure, there are many legitimate uses for this technology (although I still don't want it in my search box). The key to most of them is that there is a moment for review by a competent and motivated human built in to the process. The real failure for all of the examples above is not that the language model made something up that might or perhaps even should exist; that's built in. The problem is that human users were taken in by its authoritative tone and acted on the faulty information.

My concern is specifically that, in the post-ChatGPT rush for everyone to show that they are doing something, anything, with AI, doors will be opened to all sorts of negative consequences. These could be active abuses, such as impersonation, or passive ones, such as omitting safeguards that would prevent users from being taken in by machine hallucinations.

Both of these cases are abusive, and unlike purely technical shortcomings, it is far from being a given that these abuse vectors will be addressed at all, let alone simply by the inexorable march of technological progress. Indeed, one suspects that to the creators of ChatGPT, a successful submission to a fiction journal would be seen as a win, rather than the indictment of their entire model that it is. And that is the real problem: it is still far from clear what the endgame is for the creators of this technology, nor what (or whom) they might be willing to sacrifice along the way.

🖼️ Photo by Possessed Photography on Unsplash

¹ It's probably inevitable that LLM-produced fiction will appear sooner rather than later. My money is on the big corporate-owned shared universes. Who will care if the next Star Wars tie-in novel is written by a bot? As long as it is consistent with canon and doesn't include too many women or minorities, most fans will be just fine with a couple of hundred pages of extruded fiction product.

Ai, Machine Learning, Artificial Intelligence, Gpt, Chatgpt

Good Robot

Published by Dominic

January 17th, 2023


Last time I wrote about ChatGPT, I was pretty negative. Was I too harsh?

The reason I was so negative is that many of the early demos of ChatGPT focus on feats that are technically impressive ("write me a story about a spaceman in the style of Faulkner" or whatever), but whose actual application is at best unclear. What, after all, is the business model? Who will pay for a somewhat stilted story written by a bot, at least once the novelty value wears off? Actual human writers are, by and large, not exactly rolling in piles of dollars, so it's not as if there is a huge profit opportunity awaiting the first clever disrupter — quite apart from the moral consequences of putting a bunch of humans out of a job, even an ill-paying one.

Instead, I wanted to think about some more useful and positive applications of this technology, ones which also have the advantage that they are either not being done at all today, or can only be done at vast expense and not at scale or in real time. Bonus points if they avoid being actively abusive or enabling ridiculous grifts and rent-seeking. After all, with Microsoft putting increasing weight behind OpenAI, it's obvious that smart people smell money here somewhere.

web3 enters the AI scene pic.twitter.com/vkcyvl6rlm
— kache (yacine) (KING OF DING) (@yacineMTB) January 11, 2023

Summarise Information (B2C)

It's more or less mandatory for new technology to come with a link to some beloved piece of SF. For once, this is not a Torment Nexus-style dystopia. Instead, I'm going right to the source, with Papa Bill's Neuromancer:

"Panther Moderns," he said to the Hosaka, removing the trodes. "Five minute precis."
"Ready," the computer said.

Here's a service that everyone wants, as evidenced by the success of the "five-minute explainer" format. Something hits your personal filter bubble, and you can tell there is a lot of back story; battle lines are already drawn up, people are several levels deep into their feuds and meta-positioning, and all you want is a quick recap. Just the facts, ma'am, all sorts of multimedia, with a unifying voiceover, and no more than five minutes.

There are also more business-oriented use cases for this sort of meta-textual analysis, such as "compare this quarter's results with last quarter's and YoY, with trend lines based on close competitors and on the wider sector". You could even link with Midjourney or Stable Diffusion to graph the results (without having to do all the laborious cutting and pasting to get the relevant numbers into a table first, and making sure they use the same units, currencies, and time periods).

Smarter Assistants (B2C)

One of the complaints that people have about voice assistants is that they appear to have all the contextual awareness of goldfish. Sure, you can go to a certain amount of effort to get Siri, Alexa, and their ilk to understand "my wife" without having to use the long-suffering woman's full name and surname on each invocation, but they still have all the continuity of an amnesiac hamster if you try to continue a conversation after the first interaction. Seriously, babies have a far better idea of object persistence (peekaboo!). The robots simply have no way of keeping context between statements, outside of a few hard-coded showcase examples.

Instead, what we want is precisely that continuity: asking for appointments, being read a list, and then asking to "move the first one to after my gym class, but leave me enough time to shower and get over there". This is the sort of use case that explains why Microsoft is investing so heavily here: they are so far behind otherwise that why not? Supposedly Google has had this tech for a while and just couldn't figure out a way to introduce it without disrupting its cash-cow search business. And Apple never talks about future product directions until they are ready to launch (with the weird exception of Project Titan, of course), so it may be that they are already on top of this one. Certainly it was almost suspicious how quickly Apple trotted out specific support for Stable Diffusion.

Tier Zero Support (B2B)

Back in the day, I used to work in tech support. The classic division of labour in that world goes something like this:

Tier One, aka "the phone firewall": people who answer telephone or email queries directly. Most questions should be solved at this level.
Tier Two: these are more expert people, who can help with problems which cannot be resolved quickly at Tier One. Usually customers can’t contact Tier Two directly; their issues have to be escalated there. You don't want too many issues to get to this level, because it gets expensive.
Tier Three: in software organisations, these are usually the actual engineers working on the product. If you get to Tier Three, your problem is so structural, or your enhancement request is sufficiently critical, that it's no longer a question of helping you to do something or fixing an issue, but changing the actual functioning of the product in a pretty major way.

Obviously, there are increasing costs at each level. A problem getting escalated to Tier Two means burning the time of more senior and expert employees, who are ipso facto more expensive. Getting to Tier Three not only compounds the monetary cost, but also adds opportunity costs: what else are those engineers not doing, while they work on this issue? Therefore, tech support is all about making sure problems get solved at the lowest possible tier of the organisation. This focus has the happy side-effect of addressing the issue faster, and with fewer communications round-trips, which makes users happier too.

It's a classic win-win scenario — so why not make it even better? That's what the Powers That Be decided to do where I was. They added a "Tier Zero" of support, that was outsourced (to humans), with the idea that they would address the huge proportion of queries that could be answered simply by referring to the knowledge base¹.

So how did this go? Well, it was such a disaster that my notoriously tight-fisted employers² ended up paying to get out of the contract early. But could AI do better?

In theory, this is not a terrible idea. Something like ChatGPT should be able to answer questions based on a specific knowledge base, including past interactions with the bot. Feed it product docs, FAQs, and forum posts, and you get a reasonable approximation of a junior support engineer. Just make sure you have a way for a user to pull the rip-cord and get escalated to a human engineer when the bot gets stuck, and why not?
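To make the shape of that idea concrete, here is a minimal sketch of a Tier Zero responder with a rip-cord. It is an illustration only: the `Article` type, the `answer` function, the `min_overlap` threshold, and the two sample knowledge-base entries are all hypothetical, and crude word overlap stands in for whatever a real language model would do. The point is the control flow, where the bot hands over to a human instead of guessing.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    """One knowledge-base entry (hypothetical structure)."""
    title: str
    body: str


def tokens(text: str) -> set[str]:
    """Lower-case word set; a crude stand-in for real text matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query_words: set[str], article: Article) -> int:
    """Count how many query words appear in the article."""
    return len(query_words & tokens(article.title + " " + article.body))


def answer(query: str, kb: list[Article], min_overlap: float = 0.5) -> str:
    """Point at the best-matching article, or escalate when nothing fits well."""
    q = tokens(query)
    if not q:
        return "ESCALATE: empty query"
    best = max(kb, key=lambda a: overlap(q, a), default=None)
    if best is None:
        return "ESCALATE: knowledge base is empty"
    # Fraction of the query's words that the best article covers.
    score = overlap(q, best) / len(q)
    if score < min_overlap:
        # The rip-cord: hand over to a human rather than guess.
        return "ESCALATE: no confident match"
    return f"See: {best.title}"


# Made-up sample data, purely for illustration.
KB = [
    Article("Resetting your password",
            "Use the reset link on the login page to set a new password."),
    Article("Installing a license key",
            "Paste the license key into the admin console and restart."),
]

print(answer("reset password link", KB))
print(answer("my printer is on fire", KB))
```

The threshold is the interesting design choice: set it too low and the bot confidently answers questions it does not understand, set it too high and every query lands on a human anyway.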

One word of caution: the way I moved out of tech support is that I would not only answer the immediate question from a customer, but I would go find the account manager afterwards and tell them their customer needed consulting, or training, or more licenses, or whatever it was. AI might not have the initiative to do that.

Another drawback: it's hard enough to give advice in a technical context, but at least there, a command will either execute or not; it will give the expected results, or not (and even then, there may be subtle bugs that only manifest over time). Some have already seized on other domains that feature lots of repetitive text as opportunities for ChatGPT. Examples include legal contracts, and tax or medical advice. But what about plausible-but-wrong answers? If your chatbot tells me to cure my cancer with cleanses and raw vegetables, can I (or my estate) sue you for medical malpractice? If your investor agreement includes a logic bug that exposes you to unlimited liability, do you have the right to refuse to pay out? Fun times ahead for all concerned.

Formulaic Text (B2B)

Another idea for automated text generation is to come up with infinite variations on known original text. In plain language, I am talking about A/B testing website copy in real time, rewriting it over and over to entice users to stick around, interact, and with any luck, generate revenue for the website operators.

Taken to the extreme, you get the evil version, tied in with adtech surveillance to tweak the text for each individual visitor, such that nobody ever sees the same website as anyone else. Great for plausible deniability, too, naturally: "of course we would never encourage self-harm — but maybe our bot responded to something in the user's own profile…".

This is the promise of personalised advertising, that is tweaked to be specifically relevant to each individual user. I am and remain sceptical of the data-driven approach to advertising; the most potent targeted ads that I see are the same examples of brand advertising that would have worked equally well a hundred years ago. I read Monocle, I see an ad for socks, I want those socks. You show me a pop-up ad for socks while I am trying to read something unrelated, I dismiss it so fast that I don't even register that it's trying to sell me socks. It's not clear to me that increasing smarts behind the adtech will change the parameters of that equation significantly.

Scary LLM exercise: go down list of FBI common scams (https://t.co/1ga8z7DsoB) and imagine how each of these becomes drastically more dangerous with the power of LLMs. We're spending too much time worrying about whether LLMs are factual and not enough about this.

CC @ojoshe
— Osman (Ozzie) Osman (@oao84) January 12, 2023

De-valuing Human Labour

We provided mental health support to about 4,000 people — using GPT-3. Here’s what happened 👇
— Rob Morris (@RobertRMorris) January 6, 2023

These are the use cases that seem to me to be plausible and defensible. There will be others that have a shorter shelf life, as illustrated in Market For Lemons:

What happens when every online open lobby multiplayer game is choked with cheaters who all play at superhuman levels in increasingly undetectable ways?

What happens when, from the perspective of the average guy, "every girl" on every dating app is a fiction driven by an AI who strings him along (including sending original and persona-consistent pictures) until it's time to scam money out of him?

What happens when, from the perspective of the average girl, "every guy" on the internet has become weirdly dismissive and hostile, because he's been conditioned to think that any girl that seems interested in him must be fake and trying to scam money out of him?

What happens when comments sections on every forum gets filled with implausibly large consensus-building hordes who are able to adapt in real time and carefully slip their brigading just below the moderator's rules?

What these AI-enabled "growth hacks" boil down to is taking advantage of a market that has already outsourced labour and creativity to (human) non-employees: multiplayer games, user-generated content, and social media in general. Instead of coming up with a storyline for your game, why not just make users pay to play with each other? Instead of paying writers, photographers, and video makers, why not just let them upload their content for free? And with social media, why not just enable users to live vicariously through the fantasy lives of others, while shilling them products that promise to let them join in?

Now computers can deliver against those savings even better — but only for a short while, until people get bored of dealing with poor imitations of fellow humans. We old farts already bailed on multiplayer games, because it's no fun spending my weekly hour of gaming just getting ganked repeatedly by some twelve-year-old who plays all day. Increasingly, I bailed on UGC networks: there is far more quantity than quality, and I would rather pay for a small amount of quality than have to sift through the endless quantity.

If the pre-teen players with preternaturally accurate aim are now actually bots, and the AI-enhanced influencers are now actually full-on AIs, those developments are hardly likely to draw me back to the platforms. Any application of AI tech that is simply arbitrage on the cost of humans without factoring in other aspects has a short shelf life.

Taken to its extreme, this trend leads to humans abdicating the web entirely, leaving the field to AIs creating content that will be ranked by other AIs, and with yet more AIs rewarding the next generation of paperclip-maximising content-producing AIs. A bleak future indeed.

So What's Next?

At this point, with the backing of major players like Microsoft and Apple, it seems that AI-enabled products are somewhat inevitable. What we can hope for is that, after some initial over-excitement, we see fewer chatbot psychologists, and more use cases that are concrete, practical, and helpful — to humans.

🖼️ Photos by Andrea De Santis and Charles Deluvio on Unsplash

¹ Also known as RTFM queries, which stands for Read The, ahem, Fine Manual. (We didn't always say "Fine", unless a customer or a manager was listening.)
² We had to share rooms on business trips, leaving me with a wealth of stories, none of which I intend to recount in writing.


Information Push

Published by Dominic

December 10th, 2022


My Twitter timeline, like most people's, is awash with people trying out the latest bot-pretending-to-be-human thing, ChatGPT. Everyone is getting worked up about what it can and cannot do, or whether the way it does it (speed-reading the whole of the Internet) exposes it to copyright claims, inevitable bias, or simply polluting the source that it drinks from so that its descendants will no longer be able to be trained from a pool of guaranteed human-generated content, unpolluted by bot-created effluent.

I have a different question, namely: why?

Prompt engineer is not a thing.

Stop trying to make it a thing.
— Nathan Benaich (@nathanbenaich) December 6, 2022

We do not currently have a problem of lack of low-quality plausible-seeming information on the Internet; quite the opposite. The problem we have right now is one of too much information, leading to information overload and indigestion. On social media, it has not been possible for years to be a completist (reading every post) or to use a purely linear timeline. We require systems to surface information that is particularly interesting or relevant, whether on an automated algorithmic basis, or by manual curation of lists/circles/spaces/instances.

As is so often the case in this fallen world of ours, the solution to one problem inevitably begets new problems, and so it is here. Algorithmic personalisation and relevance filtering, whether of a social media timeline or the results of a query, soon lead to the question: relevant to whom?

Back in the early days of Facebook, if you "liked" the page for your favourite band, you would expect to see their posts in your timeline alerting you of their tour dates or album release. Then Facebook realised that they could charge money for that visibility, so the posts by the band that you had liked would no longer show up in your timeline unless the band paid for them to do so.

In the early days of Google, it was possible to type a query into the search box and get a good result. Then people started gaming the system, triggering an arms race that laid waste to ever greater swathes of the internet as collateral damage.

Keyword stuffing meant that metadata in headers became worthless for cataloguing. Auto-complete will helpfully suggest all sorts of things. Famously, recipes now have to start with long personal essays to be marked as relevant by the all-powerful algorithm. Automated search results have become so bad that people append "reddit" to their queries to take advantage of human curation.

Google employees explain why we haven’t seen ChatGPT like functionality in their products; the cost to serve an AI result is 10x to 100x as high as a regular web search today plus they’re too slow relative to how quick search results must be returned. pic.twitter.com/ixYDq0aI2H
— Dare Obasanjo🐀 (@Carnage4Life) December 9, 2022

This development takes us full circle to the early rivalry between automated search engines like Google and human-curated catalogues like Yahoo's. As the scale of the Internet exploded, human curation could not keep up — but now, it’s the quality problem that is outpacing algorithms' ability to keep up. People no longer write for human audiences, but for robotic ones, in the hope of rising to the surface long enough to take advantage of the fifteen minutes of fame that Andy Warhol promised them.

And the best we can think of is to feed the output of all of this striving back into itself.

We are already losing access to information. We are less and less able to control our information intake, as the combination of adtech and opaque relevance algorithms pushes information to us which others have determined that we should consume. In the other direction, our ability to pull or query information we actually desire is restricted or missing entirely. It is all too easy for the controllers of these systems to enable soft censorship, not by deleting information, but simply by making it unsearchable and therefore unfindable. Harbingers of this approach might be Tumblr's on-again, off-again approach to allowing nudity on that platform, or Huawei phones deleting pictures of protests without the nominal owners of those devices getting any say in the matter.

How do we get out of this mess?

While some are fighting back, like Stack Overflow banning the use of GPT for answers, I am already seeing proposals just to give in and embrace the flood of rubbish information. Instead of trying to prevent students from using ChatGPT to write their homework, the thinking is that we should encourage them to submit their prompts together with the model's output and their own edits and curation of that raw output. Instead of trying to make an Internet that is searchable, we should abandon search entirely and rely on ChatGPT and its ilk to synthesise information for us.

I hate all of these ideas with a passion. I want to go in exactly the opposite direction. I want search boxes to include "I know what I'm doing" mode, with Boolean logic and explicit quote operators that actually work. I do find an algorithmic timeline useful, but I would like to have a (paid) pro mode without trends or ads. And as for homework, simply get the students to talk through their understanding of a topic. When I was in school, the only written tests that required me to write pages of prose were composition exercises; tests of subjects like history involved a verbal examination, in which the teacher would ask me a question and I would be expected to expound on the topic. This approach will remain proof against technological cheating for some while yet.
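For the avoidance of doubt about what an "I know what I'm doing" mode might mean, here is a minimal sketch of literal query matching. It is an assumption-laden toy, not how any real search engine works: the `matches` function and its query syntax are hypothetical, and it covers only AND (bare words), NOT (a leading minus sign), and exact quoted phrases, leaving OR and grouping out entirely.

```python
import re

# A term is an optional minus sign followed by a "quoted phrase" or a bare word.
TERM = re.compile(r'(-?)(?:"([^"]+)"|(\S+))')


def normalise(text: str) -> str:
    """Lower-case the text and reduce it to single-spaced words."""
    return " ".join(re.findall(r"\w+", text.lower()))


def matches(query: str, document: str) -> bool:
    """True if the document satisfies every term in the query.

    Bare words are ANDed together, quoted phrases must appear verbatim
    as whole words, and a leading minus sign excludes a word or phrase.
    """
    padded = f" {normalise(document)} "
    for negated, phrase, word in TERM.findall(query):
        needle = normalise(phrase or word)
        if not needle:
            continue  # term was pure punctuation; nothing to check
        found = f" {needle} " in padded
        # Fail if a required term is absent or an excluded term is present.
        if found == bool(negated):
            return False
    return True


doc = "My grandmother's chocolate cake recipe, without the essay."
print(matches('"chocolate cake" recipe', doc))
print(matches("recipe -essay", doc))
```

Nothing here is clever, which is rather the point: the query means exactly what it says, with no helpful reinterpretation in between.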

And once again: why are we building these systems, exactly? People appear to find it amusing to chat to them — but people are very easy to fool. ELIZA could do it without burning millions of dollars of GPU time. There is far more good, valuable text out there already, generated by actual interesting human beings, than I can manage to read. I cannot fathom how anyone can think it a good idea to churn out a whole lot more text that is mediocre and often incorrect — especially because, once again, there is already far too much of that being generated by humans. Automating and accelerating the production of even more textual pablum will not improve life for anyone.

The potential for technological improvement over time is no defence, either. So what if in GPT-4 (or -5 or -6) the text gets somewhat less mediocre and is wrong (or racist) a bit less often? Then what? In what way does the creation and development of GPT improve the lot of humanity? At least Facebook and Google could claim a high ideal (even if neither of them lived up to those ideals, or engaged seriously with their real-world consequences). The entities behind GPT appear to be just as mindless as their creation.

🖼️ Photo by Owen Beard on Unsplash

