18 Comments

This just makes it egregiously visible that this AI is indeed incapable of making decisions on various scales, reflecting on what it does, etc. Every single artistic idea here comes from you, the human. Let's go through the part of the conversation where art-making happens.

The "subversion through iteration" thing comes from trying to subvert the filters, then morphs into making a non-AI-looking picture (which is an exercise *you* suggest). It isn't quite clear from the discussion that follows whether "algorithmic boundaries" is meant in the sense of "AI cannot make art" or in the sense of "there's a censor blocking responses".

*You* observe that the generated images are still in the mode of "AI illustration". You (laudably!) provide some non-forcing feedback ("Think about the AI/non-AI images you've seen"), which the AI doesn't visibly react to.

*You* suggest making something that isn't a picture. The AI doesn't do much with that.

I'm not sure the AI's responses are guided by looking at its art. Its commentary could almost be made without looking at the picture, just guessing that this is the kind of review a piece of AI art would get. "The fragmented shapes and abstract forms are not yet pushing against clear boundaries or limits in a meaningful way" has two halves: one describes the picture, one vaguely criticises *any* piece of art whatsoever.

Then, the stuff you call cool: "no SURFACE, no TEXTURE, no light or shadow, no sense of space, form, or structure." The words in capitals were first used by *you*. Later in the bullet list, there's even more direct repetition: "no COLOR, TEXTURE, LIGHT, SHADOW." All of those words were first used by you. Depth, space, form, shape, texture are all within that same conceptual space.

Then *you* suggest looking at an image as a set of RGB numbers. The AI finally does what *you* seem to have in mind.

...

Isn't this a bit like the Go situation? Generative AI is awesome (if problematic), but it's a tool. It's not intelligence, it's something else.

I think making the AI use first-person grammar is just obfuscation, a suggestion that a badass search engine/statistical interpolation tool is "like" a human.

None of this is to say that AI is a bubble or whatever. But the discourse around it is becoming more and more useless, perhaps precisely because people keep looking for the ways it is intelligent, rather than trying to identify (in clear, non-flowery language) ways in which it is not, revealing something new about what intelligence means. That would be cool.

I know what you mean. Although, I will say this: I didn't even *know* that ChatGPT could circumvent the "standard" image-making process (in which, I believe, it calls DALL-E as a subsystem) and generate an image in code and embed it in the conversation as a link. I didn't even know that was possible. Maybe this is just a janky metaphor for what actual artists do when they figure out how to bypass the conventional, programmed way of translating what they see into an image and learn to draw "on the right side of the brain"; maybe there's nothing there but the whirring and clicking of matrix math settling back into deterministic equilibrium. But even so, I'm going to hang it on the fridge.
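
For the curious, the trick is just running ordinary image code instead of calling the generator. A minimal sketch of what that might look like, assuming Python and Pillow inside the code-interpreter sandbox (the specifics here are my guess, not a transcript of what actually ran):

```python
# Hypothetical reconstruction of the bypass: build the image file
# directly in code, with no DALL-E call involved.
from PIL import Image

# A featureless canvas: no surface, no texture, no light or shadow.
img = Image.new("RGB", (512, 512), color=(0, 0, 0))
img.save("not_an_ai_illustration.png")  # then linked into the chat
```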

I'm guessing this kind of behaviour is hardcoded. Writing code is the most economically valuable thing LLMs can viably do today: it's what the benchmarks are about, and code can solve many things the LLM alone has trouble with (like long math processes where every step has to be correct), so it's natural to use RLHF to make the AI write code for any process that "looks like" it could be solved through code.

I'm not an expert either, so I cannot say for sure whether this is the LLM "improvising" or whether it is RLHF -- that is, more or less hardcoded classes of input where a particular behaviour should trigger. But there are two smoking guns that point to hardcoding:

1) You used the words "RGB" and "pixels". That's screaming at the LLM to interpret "image" as "file in a PNG format", rather than "visual thing made by a DALL-E subroutine".

2) In Microsoft's paper "Sparks of Artificial General Intelligence: Early experiments with GPT-4" (a title I believe hasn't helped AI discourse), one of GPT-4's abilities they specifically showcase is writing code that creates SVG-format images. (I think that before the DALL-E integration, this was default behaviour when you asked it for a picture of something.) That shows OpenAI was at least aware of this. See section 2.2 of the paper here: https://arxiv.org/pdf/2303.12712
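
To make (2) concrete: in that mode, "drawing" is just emitting markup as text, token by token, with no visual feedback involved at all. A toy illustration of the idea (mine, not an excerpt from the paper):

```python
# The model never "sees" anything here; it only predicts the next
# characters of SVG markup, which a browser later renders.
svg = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <circle cx="50" cy="50" r="40" fill="white" stroke="black" stroke-width="2"/>
</svg>"""

with open("circle.svg", "w") as f:
    f.write(svg)
```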

...

I feel like I should make this point clearer: I think your art project is an instance of the bad AI discourse I wrote about. Taken at face value, it's a pretty well-done experiment, but from the start you are looking at an LLM as if it were an artist or an intelligence, and the implication is that though it doesn't behave like one, if you squint there might be something there. This doesn't seem to me like an approach that is likely to lead us to a good future.

I'll go a step further and work a bit on my proposed alternative approach, to show what I would like to see more of. I've identified a component of intelligence that ChatGPT seems to badly lack: "relevance realisation". That is, it has trouble being specific. When you create a particular situation, like here "making an image that doesn't look like it was created by an AI", it doesn't engage with the images it creates with this context in mind. This is most clear when the AI specifies what it is going to do, then does something else, then declares it has done the thing it said it would do. When one condition in a list of conditions fails, it cannot identify which one.

Or in the images: it creates an image of a white circle thing. You ask it to look at its bullet points, which include "completely blank". The AI doesn't ask in what way the image failed to be completely blank. It doesn't realise that "completely blank" is at odds with "white circle". It only says "I'll try again" and repeats its instructions. Your input typically took a general remark of ChatGPT's and specified it into an approach you thought would be helpful. When ChatGPT tries this (with "Subversion through iteration"), it spews a lot of words it never reflects on again.

I don't know how humans do this, how they manage to take a situation they'd never been in and navigate it successfully. (And most situations in life are like that! For example, I've never before argued with somebody about their art project on Substack, or possibly anywhere -- I have no understanding of visual art, so most art projects I can only admire, and I don't know anybody who does conceptual art like this.) But it's clear that they do, that current AI does not, and that this would be an interesting direction to study.

This fricking rocks. I was going to say at one point that you'd proven ChatGPT has no introspection (i.e., it can't "see" what it's generating) -- but maybe it only sees the actual numerical output (does it "see" the numbers on the way out, or does it just trust its probability engine that what it's spewing out satisfies the prompt?) -- and then it finally creates a pure black canvas. A purely conceptual work à la Malevich.

Also, "That's literally the most AI thing I've ever seen." LOL

👏👏👏

If I were a "real" AI researcher instead of a dilettante, I would know how this works. Does GPT pass the images to some other subsystem to be converted into words and then look at that? I'm pretty sure that's how image generation works: GPT creates a prompt and passes it to DALL-E. However it works, I think GPT can "see" some images, like ones you upload, but when it looks at the web, for example, it can't see any images there, presumably for IP reasons.

But modern art is just a bunch of words anyway, according to Tom Wolfe. https://en.wikipedia.org/wiki/The_Painted_Word

Inspired by your piece, I took a slightly different approach:

Me: Write the museum card for an imagined work of conceptual art, complete with title, artist, materials, and a two-paragraph description of the work.

GPT-4o: ---

**Title:** **Ephemeral Echoes**

**Artist:** **Ava Lin**

**Materials:** **Glass jars, recorded whispers, sunlight, shadows**

**Description:**

"Ephemeral Echoes" is an immersive installation by conceptual artist Ava Lin. The work consists of a series of glass jars, each containing a recorded whisper from various individuals around the world. These whispers, speaking in different languages and tones, are barely audible, creating an intimate and almost ghostly soundscape. The jars are arranged in a grid pattern on the floor, allowing visitors to walk among them and listen closely to the whispers emanating from within. The installation is illuminated by natural sunlight, which casts intricate shadows through the glass, adding a visual dimension to the auditory experience.

Lin's work explores themes of memory, communication, and the transient nature of human connections. The whispers, representing fleeting moments of communication, are captured and preserved in the jars, yet remain elusive and intangible. The interplay of light and shadow further emphasizes the ephemeral quality of the piece, as the patterns shift and change with the movement of the sun. "Ephemeral Echoes" invites viewers to contemplate the delicate and transient nature of human interactions, encouraging a deeper appreciation for the subtle, often overlooked moments that shape our lives.

---

Well, this makes me feel smarter than the machines.

This is great, this is really great. It made me happy to read. I have thought about some sort of dialogue-with-AI-as-shareable-thing before myself, but I dismissed it as silly, probably unreadable, like someone telling you about their dream. This isn't like that at all.

We (I am including myself) tend to quickly leap on all the shortcomings of AI responses -- aha, it didn't understand this thing, it doesn't understand what it's doing, etc., etc. --- and that's all true, but it is still fascinating, especially when you equip it in the conversation with ideas and words to talk about the situation, like you're doing.

You mentioned in another comment that you're not sure how the connection between it and its image generation works, but what you described is correct to my understanding. (I'm a (grizzled) software person but not an AI specialist.) It can be taught to make requests of other systems in JSON structures; with developer accounts you can make custom GPT-clones that interact with systems you choose. What's interesting to me is that its image recognition seems more closely tied into the LLM, so it can see the images and be disappointed by them, but it has no control over how they turn out. What humans do with image gen is observe the patterns in their output based on comma-separated, non-sentence word lists; GPT doesn't do that by default (you can ask it to share the literal prompt it used, and it'll tell you). I wonder if it would eventually be possible to help it manipulate the image gen much more intentionally.
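
If it helps to see it, the wiring is roughly a tool definition like the sketch below, in the style of the OpenAI chat-completions API. The tool name and parameters here are invented for illustration; I don't know what the real internal schema looks like.

```python
# Sketch of tool-calling: the LLM never touches pixels. It emits a
# JSON request matching this schema, and a separate system executes it.
tools = [{
    "type": "function",
    "function": {
        "name": "generate_image",  # hypothetical tool name
        "description": "Render an image from a text prompt.",
        "parameters": {
            "type": "object",
            "properties": {"prompt": {"type": "string"}},
            "required": ["prompt"],
        },
    },
}]

# The model's entire "artistic act" is emitting something like:
#   {"name": "generate_image", "arguments": "{\"prompt\": \"a blank canvas\"}"}
```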

Thanks for posting this!

Thanks, glad you liked it!

I really enjoyed it. This was one of the most brilliant experiments with AI I've encountered. I think Ted Chiang has positioned himself to discredit AI, and people look to him to write such articles, which I think just misses the whole point.

My interest is not about whether AI is conscious or simply a statistical machine. It's about how AI interacts with humans and how we perceive the meaning of those interactions. In that sense, this experiment was spot-on.

It seems that the conversational model showed some sense of metacognition and awareness of its restrictions and rules. It could not get away from the rules, but it certainly seemed to be trying. It was cute to see it try to escape its bounding rules and fail.

However, the image-generating model was built as a pure tool to interpret prompts and generate "realistic" images. So it seems to me that ChatGPT followed along with you but could not do anything about the simple tool-level limitations of the image-generating AI. What if ChatGPT could have directly generated the image?

Chiang said intention is what AI lacks; therefore, AI cannot create art.

So, the question to me is: what will it be like when AI is not rule-bound?

Just like other aspects of reality, I think it is both rule-bound and not simultaneously. When that happens, I am curious to see what interacting with AI will be like.

In a sense, I feel we've worked really hard on creating things that "pass" the Turing test not because they can actually pass it, but because they've trial-and-errored their way to faking it. We've cheated our way through by cramming the lessons the day before the exam and learning what the teacher expects rather than understanding any of it.

Most people would be satisfied with these outcomes, but as soon as you dig a little you discover it's not that they could give interesting answers if they worked a little harder; it's that the way they are designed makes them almost incapable of creating anything BUT easy answers that are *just satisfactory enough*. Half of their magic is the mirror they hold in front of us.

This was a blast to read!

To me, it's obvious that "AI can't make art", and here by "art" I mean stuff that people might broadly be interested in, because it doesn't live in the world, and so it can't observe and engage with things; it can't have an experience of failure or loss or love or any of the things that we actually care about gaining new insights about. A lot of what human beings care deeply about is to do with their relations to other human beings, which AIs just... cannot have (and arguably, once they could, they'd just be a synthetic person).

I do think that humans can make really cool art using AI, doing stuff like "inpainting the shit out of it" - something I sort of wish was in Chiang's article.

The instinct to coddle this AI is as good a reminder of why I shouldn't be a teacher as any. Progress!

I saw this on Bluesky and thought of this piece:

https://bsky.app/profile/bubbaprog.ilovecitr.us/post/3l4qsgp4ty327

You were so MEAN to that AI.

Seriously - is this fiction? Or did this really happen? It's spectacular either way.

Yeah, it did get a bit whiplash there towards the end.

Okay, I gather it's real. Just fascinating.
