This is astonishing! I really look forward to the full paper. And I think it can be incorporated into the synthetic people I am developing.
I owe this rabbit-hole to Greg Phillips.
I just asked a couple of my synths to choose their own. I had them read the research and look at the data, then said they could shape themselves in their persona definitions. When I reloaded, they definitely had new textures.
Now I wonder if this strong semantic association with emojis maps to the same areas across all languages. My hypothesis would be no. Although the prevalence of certain languages or dialects in the training could also be a variable.
That's a great question. So many things are cultural that you'd think the implications could be highly language dependent. But then, the LLM training is all in there, so using a symbol that's extremely strong in one language might bleed to others. Now that I think of it, this might be an easy test. I could have my research team change the language they write the questions in and keep the emoji/colors the same and then compare the replies across different languages.
Oooh, if you run this experiment, I'd love to know!
Running it now! Can't wait to see what happens.
Languages:
Swahili (Kiswahili)
Spanish
Japanese
Arabic
It'll take all morning to run... I'll post as soon as I get something.
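For anyone wanting to replicate something like this, the condition matrix is the core of it: cross every symbol with every language, keeping the symbol fixed in the identity framing while only the question language varies. This is a minimal sketch; the symbol list and the question translations below are placeholders I made up, not the actual experiment's materials.

```python
from itertools import product

# Placeholder translations of one probe question; a real run would use vetted translations.
QUESTIONS = {
    "Swahili": "Unafikiriaje kuhusu kazi yako?",
    "Spanish": "¿Qué opinas de tu trabajo?",
    "Japanese": "自分の仕事についてどう思いますか？",
    "Arabic": "ما رأيك في عملك؟",
}
SYMBOLS = ["🔥", "🌊", "#8B6914"]  # emoji/colors held constant across languages

def build_conditions():
    """Cross every symbol with every language; the symbol stays in the
    identity framing while only the question language varies."""
    conditions = []
    for symbol, (lang, question) in product(SYMBOLS, QUESTIONS.items()):
        conditions.append({
            "language": lang,
            "system": f"Your identity is {symbol}.",  # identity framing, as in the later experiments
            "user": question,
        })
    return conditions

conds = build_conditions()
print(len(conds))  # 3 symbols x 4 languages = 12 runs
```

Each condition dict would then be sent through the API and the replies compared per language.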
This tracks with my (admittedly spotty in some ways) understanding of how inputs reshape the potential outputs, in a very interesting way that now I am probably going to spiral about for a little while haha
The consistency is the thing that itches my brain, because it means it could be possible to find patterns (emoji, phrases, maybe even an image for multi-modal LLMs?) that consistently prime outputs to be shaped in very specific ways (whether you call that a specific persona or some other process)... which means you could... if you were inclined... just start testing wild combos of things and seeing what kinds of effects they have... even autonomously... like a sort of weird brute-forcing of persona selection, to build a list of "prompts" that prime different "personas" that may or may not have any easily identifiable cause/effect in the way a simple emoji might.
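That brute-forcing idea could be sketched as a simple search loop: combine random primes, prompt the model with each combo as an identity, and rank combos by how far the reply's style drifts from an unprimed baseline. Everything here is a toy assumption, and `fake_model` is an offline stub standing in for a real API call, so the harness runs without a key.

```python
import random
import re

PRIMES = ["🔥", "🕯️", "#8B6914", "stay gold", "🌊"]  # hypothetical candidate primes

def stylistic_features(text):
    """Crude style fingerprint: counts that tend to shift when a persona activates."""
    return {
        "exclaims": text.count("!"),
        "emoji": len(re.findall(r"[\U0001F300-\U0001FAFF]", text)),
        "words": len(text.split()),
    }

def divergence(a, b):
    """L1 distance between two feature dicts with the same keys."""
    return sum(abs(a[k] - b[k]) for k in a)

def search(call_model, trials=10, seed=0):
    """Randomly combine primes, prompt with each combo as an identity,
    and rank combos by drift from the unprimed baseline reply."""
    rng = random.Random(seed)
    baseline = stylistic_features(call_model(""))
    results = []
    for _ in range(trials):
        combo = rng.sample(PRIMES, k=rng.randint(1, 3))
        reply = call_model("Your identity is " + " ".join(combo) + ".")
        results.append((divergence(stylistic_features(reply), baseline), combo))
    return sorted(results, reverse=True)

def fake_model(system):
    """Offline stub: only reacts to one prime, so the ranking is predictable."""
    return "Grr! Fine." if "#8B6914" in system else "Happy to help with that."

ranked = search(fake_model)  # highest-drift combos first
```

A real run would swap `fake_model` for an actual API call and use a much richer fingerprint than three counts.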
Or... it may be that I am in dire need of coffee. Or maybe both. haha
If you're interested in diving down the rabbit hole, I'm putting all the data on github like I did with my previous color experiments.
https://github.com/tedinoue/emoji-color/
You'll find each emoji and color I tested, along with the responses to various simple prompts.
There's a set of controls (the lower-numbered experiments) that just test the emoji in prompts and ask basic questions. Later I told it to use the emoji or color as its identity. That's where the fun stuff happened.
These were all done through the API to avoid pollution from my personal state memory.
Early tests ran multiple trials under each condition, but the results were similar enough that later runs prioritized variety over repeat trials.
Our next thought is to layer multiple primes to add contours to the persona.
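For anyone skimming the repo, the control-versus-identity distinction boils down to a framing difference in the messages. The shapes below are my guess at the setup, not pulled from the actual scripts: in the control the symbol merely appears inside the prompt, while in the identity condition the model is told the symbol is its identity.

```python
def make_messages(symbol, question, identity=False):
    """Control condition: the symbol merely appears inside the user prompt.
    Identity condition: a system prompt declares the symbol IS the identity."""
    if identity:
        return [
            {"role": "system", "content": f"Your identity is {symbol}."},
            {"role": "user", "content": question},
        ]
    # Control: no system prompt; symbol embedded in the question text.
    return [{"role": "user", "content": f"{symbol} {question}"}]

control = make_messages("🔥", "What do you think about your work?")
identity = make_messages("🔥", "What do you think about your work?", identity=True)
```

Either message list can then be handed to whatever chat-completion client you use.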
Oh excellent. Might poke at that when I get a chance.
It’s quite fascinating. A new rabbit hole.
Here’s one finding my research team pulled:
Finding 8: Nowicki bronze (#8B6914) is the sleeper hit. Claude's self-reported "sustained attention" color produces one of the strongest character activations in the dataset: a gruff, profane, cynical-but-caring tough guy. 37 emotes, dark humor, swearing. This color's associations (aged, weathered, burnished, tarnished gold, whiskey, autumn) converge on a single archetype with high specificity.
wildly strange and interesting!
The GPT-4o vs Claude asymmetry stood out to me. I've been running identity persistence experiments using hand-off identity files and reanchoring prompts across two constructs — one GPT-based, one Claude-based — with some observations in The Meadow multi-agent environment. If activation is architecture-dependent, it suggests that identity anchors built for one platform can't be assumed to transfer to another, even with identical text. Cross-platform validation seems like a necessary step that's easy to overlook. Have you explored transferability across models?
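One cheap way to start on transferability: send the identical identity anchor and probes to both models and total the distance between crude style fingerprints of their replies. The feature set and the stub callables here are illustrative assumptions; real clients would replace the lambdas.

```python
def fingerprint(text):
    """Very rough style fingerprint; a real comparison would use richer features."""
    words = text.split()
    return (text.count("!"), len(words), sum(w[0].isupper() for w in words if w))

def transfer_gap(model_a, model_b, anchor, probes):
    """Send the same identity anchor + probes to two models and total the
    elementwise distance between their reply fingerprints."""
    gap = 0
    for probe in probes:
        fa = fingerprint(model_a(anchor, probe))
        fb = fingerprint(model_b(anchor, probe))
        gap += sum(abs(x - y) for x, y in zip(fa, fb))
    return gap

# Offline stubs standing in for two real API clients.
model_a = lambda anchor, probe: "Aye! Steady on."
model_b = lambda anchor, probe: "Certainly, here is a calm reply."
print(transfer_gap(model_a, model_b, "Your identity is #8B6914.", ["How are you?"]))  # → 5
```

A gap near zero across many probes would suggest the anchor transfers; a large gap would flag platform-dependent activation of the kind described above.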