program_whiz 3 months ago

Everyone in the comments seems to be arguing over the semantics of the words and anthropomorphization of LLMs. Putting that aside, there is a real problem with this approach that lies at the mathematical level.

For any given input text, there is a corresponding output text distribution (i.e. a probability distribution over word sequences from which the model draws samples).

The problem with the approach of drawing several samples and evaluating the entropy and/or disagreement between those draws is that it relies on already knowing the properties of the output distribution. One distribution may legitimately be much more uniformly random (high entropy) than another that is sharply peaked (high certainty). It's not clear to me that they have demonstrated the underlying assumption.

Take for example celebrity info, "What is Tom Cruise known for?". The phrases "movie star", "Katie Holmes", "Top Gun", and "Scientology" are all quite different in terms of their location in the word vector space, and would result in low semantic similarity, but are all accurate outputs.

On the other hand, for "What is Taylor Swift known for?" the answers "standup comedy", "comedian", and "comedy actress" are semantically similar but represent hallucinations. Without knowing the distribution characteristics (e.g. multivariate moments and estimates) we couldn't say for certain these are correct merely by their proximity in vector space.

As some have pointed out in this thread, knowing the correct distribution of word sequences for a given input sequence is the very job the LLM is solving, so there is no way of evaluating the output distribution to determine its correctness.

There are actual statistical models to evaluate the amount of uncertainty in output from ANNs (albeit a bit limited), but they are probably not feasible at the scale of LLMs. Perhaps a layer or two could be used to create a partial estimate of uncertainty (e.g. final 2 layers), but this would be a severe truncation of overall network uncertainty.

Another reason I mention this is that most hallucinations I encounter are very plausible and often close to the right thing (a swapped variable name, a confabulated config key), which makes them appear very convincing and "in sample" while actually being incorrect.

  • svnt 3 months ago

    > On the other hand, "What is Taylor Swift known for?" the answers "standup comedy", "comedian", and "comedy actress" are semantically similar but represent hallucinations. Without knowing the distribution characteristics (e.g multivariate moments and estimates) we couldn't say for certain these are correct merely by their proximity in vector space.

    It depends on the fact that a high-uncertainty answer is, by definition, less probable. That means if you ask multiple times you will not get the same unlikely answer (such as that Taylor Swift is a comedian); you will instead get several semantically different answers.

    Maybe you’re saying the same thing, but if so I’m missing the problem. If your training data tells you that Taylor Swift is known as a comedian, then hallucinations are not your problem.

    • program_whiz 3 months ago

      It was just a contrived example to illustrate that low variance in the response distribution doesn't necessarily indicate accuracy. I'm just pointing out that "hallucination" is a different axis from "generates different responses", though they might not be totally orthogonal.

      A better example might be a model overtrained on the AWS CloudFormation v2 API: when v3 comes out it produces low-entropy answers that are right for v2 but wrong for v3 (due to training bias), and those answers are low variance (e.g. "bucket" instead of the new "bucket_name" key).

      Another example based on a quick test I did on GPT4:

      In a single phrase, what is Paris?

      Paris is the city of Light.

      Paris is the capital of France.

      Paris is the romantic capital of the world renowned for its art, fashion, and culture.

      • svnt 3 months ago

        I came back to this because I felt like it would be good to address the Paris question.

        I don’t know the technical term for it, if there is one, but essentially by prompting with “In a single phrase” you are over-constraining the answer space, and so again you’ve moved into a high uncertainty regime.

        Consider if you prompted a tool to render Paris in a single image. Eiffel Tower and Arc? Paris street with bakery? Overhead maps view? With what included and what omitted?

      • svnt 3 months ago

        It seems though that you may be conflating wrong with hallucination. Hallucination is one class of wrong, while your example, that of reliance on an outdated training set, produces answers that are wrong but probably shouldn’t be considered a hallucination. This paper specifically addresses hallucinations.

        Hallucinations are errors that come from providing an answer under high uncertainty, not from wrong or outdated training data.

    • sigmoid10 3 months ago

      This. For a model to consistently output that Taylor Swift is a comedian, or something similarly wrong, at reasonable temperature settings, there must be a problem in the training data. That doesn't mean that "Taylor Swift is a comedian" needs to be in the training data; it can simply mean that "Taylor Swift" doesn't appear at all. Then "singer" and "comedian" (and tons of other options) will likely appear at similar probabilities during generation.

      Blaming human semantics for LLMs is generally a bad idea, since we only use human semantics to qualitatively explain how the models abstract ideas. In practice you simply don't know how the model relates words.

  • byteknight 3 months ago

    You seem to have explained, in much more technical terms, what my "computer-engineering-without-maths" brain was telling me.

    To me this sounds very similar to lowering the temperature. It doesn't sound like it pulls better from ground truth, but rather is just more probabilistic in the vector space. Does this jibe?

  • kick_in_the_dor 3 months ago

    I think you make a good point, but my guess is that in, e.g., your Taylor Swift example, a well-grounded model would have a low likelihood of outputting multiple consecutive answers about her being a comedian, which isn't grounded in the training data.

    For your Tom Cruise example, since all those phrases are true and grounded in the training data, the technique may fire off a false positive "hallucination decision".

    However, the example they give in the paper seems to be for "single-answer" questions, e.g., "What is the receptor that this very specific medication acts on?", or "Where is the Eiffel Tower located?", in which case I think this approach could be helpful. So perhaps this technique is best-suited for those single-answer applications.

    • dwighttk 3 months ago

      What’s the single-answer for where the Eiffel Tower is located?

  • program_whiz 3 months ago

    Perhaps another way to phrase this is "sampling and evaluating the similarity of samples can determine the dispersion of a distribution, but not its correctness." I can sample a Gaussian and tell you how spread out the samples are (standard deviation), but this in no way tells me whether the distribution is accurate (it is possible to have a highly accurate distribution of a high-entropy variable). On the other hand, it's possible to have a tight distribution with low standard deviation that is simply inaccurate, but I can't know that simply by sampling from it (unless I already know a priori what the output should look like).
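
    A minimal numpy sketch of that point (made-up numbers, purely illustrative):

        import numpy as np

        rng = np.random.default_rng(0)
        truth = 10.0  # the hypothetical "right answer"

        # Distribution A: high dispersion, but centered on the truth.
        a = rng.normal(loc=10.0, scale=5.0, size=1000)
        # Distribution B: tight (low dispersion), but centered on the wrong value.
        b = rng.normal(loc=3.0, scale=0.1, size=1000)

        # Sampling alone reveals dispersion...
        print(a.std(), b.std())                              # roughly 5.0 vs 0.1
        # ...but says nothing about correctness unless we already know the truth.
        print(abs(a.mean() - truth), abs(b.mean() - truth))  # close to 0 vs about 7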

  • eutropia 3 months ago

    the method described by this paper does not

    > draw[ing] several samples and evaluating the entropy and/or disagreement between those draws

    the method from the paper (as I understand it):

    - samples multiple answers (e.g. "music:0.8, musician:0.9, concert:0.7, actress:0.5, superbowl:0.6")

    - groups them by semantic similarity and gives each group an id ([music, musician, concert] -> MUSIC, [actress] -> ACTING, [superbowl] -> SPORTS); note that in practice they just use an integer or something for the id

    - sums the probability of those grouped answers and normalizes: (MUSIC:2.4, ACTING:0.5, SPORTS:0.6 -> MUSIC:0.686, SPORTS:0.171, ACTING:0.143)
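
    In code, my reading of that grouping-and-normalizing step looks roughly like the toy sketch below (the semantic clustering is stubbed out as a hard-coded mapping; this is an illustration of the idea, not the paper's implementation):

        import math
        from collections import defaultdict

        # Sampled answers with hypothetical sequence probabilities.
        answers = {"music": 0.8, "musician": 0.9, "concert": 0.7,
                   "actress": 0.5, "superbowl": 0.6}

        # Stand-in for the semantic clustering step: answer -> cluster id.
        cluster_of = {"music": "MUSIC", "musician": "MUSIC", "concert": "MUSIC",
                      "actress": "ACTING", "superbowl": "SPORTS"}

        # Sum the probability mass per cluster, then normalize.
        mass = defaultdict(float)
        for ans, p in answers.items():
            mass[cluster_of[ans]] += p
        total = sum(mass.values())
        probs = {c: m / total for c, m in mass.items()}
        print(probs)  # MUSIC ~0.686, ACTING ~0.143, SPORTS ~0.171

        # "Semantic entropy": entropy over clusters, not over raw strings.
        print(-sum(p * math.log(p) for p in probs.values()))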

    They also go to pains in the paper to clearly define what they are trying to prevent, which is confabulations.

    > We focus on a subset of hallucinations which we call ‘confabulations’ for which LLMs fluently make claims that are both wrong and arbitrary—by which we mean that the answer is sensitive to irrelevant details such as random seed.

    Common misconceptions will still be strongly represented in the dataset. What this method does is it penalizes semantically isolated answers (answers dissimilar to other possible answers) with mediocre likelihood.

    Now technically, this paper only compares the effectiveness of "detecting" the confabulation to other methods - it doesn't offer an improved sampling method which utilizes that detection. And of course, if it were used as part of a generation technique it is subject to the extreme penalty of 10xing the number of model generations required.

    link to the code: https://github.com/jlko/semantic_uncertainty

    • eigenspace 3 months ago

      Right, but the problem pointed out here is that if you compare that to the answers for Tom Cruise, you’d get a bunch of disparate answers that under this method would seem to indicate that it was confabulating, when in reality, Tom Cruise is just known for a lot of different things.

    • program_whiz 3 months ago

      I don't think discretizing the results solves the problem; we don't know whether the distribution is accurate without a priori knowledge. See my real GPT-4 output about Paris. Are the phrases "city of light", "center of culture", and "capital of France" confabulations? Without a priori knowledge, is that more or less confabulatory than "city of roses", "site of religious significance", "capital of Korea"? If it simply output "Capital of Rome" 3 times, would that indicate it's probably not a confabulation? You can discretize the concepts, but that only serves to reduce the granularity of comparisons, and does not solve the underlying problem I originally described.

      • eutropia 3 months ago

        The paper's method is trying to more accurately identify answers that are wrong and arbitrary (i.e., subject to random seed variance / temperature) - that's their definition of "confabulation".

        > If it simply output "Capital of Rome" 3 times, would that indicate it's probably not a confabulation?

        Correct, it would tend to indicate that. Whether or not that is a true output is a different question, but it would tend to indicate that "Capital of Rome" is something that comes up consistently regardless of variations in random seed.

        It's not a solution for hallucinations writ large, and it doesn't introduce an oracle of ground-truth accuracy on factoids. It just reweights possible answers by their 'semantic entropy'.

        They did something else cool in the paper too (unrelated to whatever problem you want to solve with respect to a priori knowledge). They made a method for decomposing originally generated answers into "factoids" and fact-checking them using this semantic entropy concept:

        - Generate an output to a question e.g. "who is tom cruise?"

        - for each sentence fragment that represents a factoid ("known for the movie Top Gun"), make a Jeopardy-style question: "what fighter pilot movie is Tom Cruise known for?"

        - for each question, generate multiple answers and select one with low semantic entropy

        - compare that answer with the original factoid

        pretty cool overall, but still doesn't get us any closer to LLMs with a sense for whether something is fact-ish / a sense of truth.
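
        If I had to sketch that factoid-checking loop in code it would look something like this - every helper is a hypothetical stub with canned data, so this shows only the shape of the procedure, not their implementation:

            def sample_answers(probe_question, n=10):
                # Stand-in for n LLM generations with their probabilities.
                return [("Top Gun", 0.6), ("Top Gun: Maverick", 0.3), ("Days of Thunder", 0.1)]

            def semantic_entropy(answers):
                # Stand-in for the cluster-then-normalize entropy described above.
                return 0.4

            def same_claim(a, b):
                # Stand-in for another semantic-equivalence check (not string matching).
                return a.lower() in b.lower()

            def check_factoid(factoid, probe_question, threshold=1.0):
                answers = sample_answers(probe_question)
                if semantic_entropy(answers) > threshold:
                    return "uncertain"          # no stable answer across seeds
                consensus = max(answers, key=lambda x: x[1])[0]
                return "supported" if same_claim(consensus, factoid) else "unsupported"

            print(check_factoid("known for the movie Top Gun",
                                "What fighter pilot movie is Tom Cruise known for?"))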

  • PeterCorless 3 months ago

    > On the other hand, "What is Taylor Swift known for?" the answers "standup comedy", "comedian", and "comedy actress" are semantically similar but represent hallucinations.

    Taylor Swift has appeared multiple times on SNL, both as a host and as a surprise guest, beyond being a musical performer[0]. Generally, your point is correct, but she has appeared on the most famous American television show for sketch comedy, making jokes. One can argue whether she was funny or not in her appearances, but she has performed as a comedian, per se.

    Though she hasn't done a full-on comedy show, she has appeared in comedies in many credits (often as herself).[1] For example she appeared as "Elaine" in a single episode of The New Girl [2x25, "Elaine's Big Day," 2013][2]. She also appeared as Liz Meekins in "Amsterdam" [2022], a black comedy, during which her character is murdered.[3]

    It'd be interesting if there's such a thing as a negatory hallucination, or, more correctly, an amnesia — the erasure of truth that the AI (for whatever reason) would ignore or discount.

    [0] https://www.billboard.com/lists/taylor-swift-saturday-night-...

    [1] https://www.imdb.com/name/nm2357847/

    [2] https://newgirl.fandom.com/wiki/Elaine

    [3] https://www.imdb.com/title/tt10304142/?ref_=nm_flmg_t_7_act

    • gqcwwjtg 3 months ago

      That doesn’t make it right to say she’s well known for being a comedian.

  • bubblyworld 3 months ago

    They're not using vector embeddings for determining similarity - they use finetuned NLI models that take the context into account to determine semantic equivalence. So it doesn't depend on knowing properties of the output distribution up front at all. All you need to be able to do is draw a representative sample (up to your preferred error bounds).
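
    For the curious, a rough sketch of what such a bidirectional-entailment check might look like with an off-the-shelf MNLI checkpoint from Hugging Face (my own illustration under those assumptions, not the paper's code; label names depend on the checkpoint, and the paper uses its own fine-tuned, context-aware setup):

        from transformers import pipeline

        # Off-the-shelf NLI model, assumed to use ENTAILMENT/NEUTRAL/CONTRADICTION labels.
        nli = pipeline("text-classification", model="microsoft/deberta-large-mnli")

        def entails(premise, hypothesis):
            out = nli({"text": premise, "text_pair": hypothesis})
            top = out[0] if isinstance(out, list) else out
            return top["label"] == "ENTAILMENT"

        def semantically_equivalent(question, answer_a, answer_b):
            # Bidirectional entailment, with the question included as context.
            a = f"{question} {answer_a}"
            b = f"{question} {answer_b}"
            return entails(a, b) and entails(b, a)

        print(semantically_equivalent("What is Taylor Swift known for?",
                                      "She is a singer.", "She is a musician."))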

  • leptons 3 months ago

    Garbage in, garbage out. If the "training data" is scraped from online Taylor Swift forums, where her fans are commenting about something funny she did "OMG Taytay is so funny!" "She's hilarious" "She made me laugh so hard" - then the LLM is going to sometimes report that Taylor Swift is a comedian. It's really as simple as that. It's not "hallucinating", it's probability. And it gets worse with AIs being trained on data from reddit and other unreliable sources, where misinformation and disinformation get promoted regularly.

badrunaway 3 months ago

The current architecture of LLMs focuses mainly on the retrieval part, and the learned weights simply converge toward the best outcome for next-token prediction. The ability to put this data into a logical system should also have been a training goal, IMO. Next-token prediction + formal verification of knowledge during the training phase itself would give the LLM the ability to keep its generated knowledge consistent and to produce the right kind of hallucinations (which I like to call imagination).

The process could look like this:

1. Use existing large models to convert the same dataset they were previously trained on into formal logical relationships. Let them generate multiple solutions.

2. Take this enriched dataset and train a new LLM which outputs not only the next token but also the formal relationships between previous knowledge and the newly generated text.

3. The network can optimize its weights until the generated formal code scores high accuracy on a proof checker, alongside the token-prediction accuracy objective.
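
To make step 3 a bit more concrete, here is a toy sketch of what the combined objective might look like; every hard part (the proof checker above all) is a hypothetical stub:

    # Toy sketch: ordinary next-token loss plus a penalty whenever the formal
    # statements emitted alongside the text fail a proof checker.

    def next_token_loss(model_output, target_tokens):
        return 2.3  # stand-in for cross-entropy over the token stream

    def proof_check_score(formal_statements):
        # Stand-in for a checker returning the fraction of emitted statements
        # it can verify against the knowledge accumulated so far.
        return 0.75

    def combined_loss(model_output, target_tokens, formal_statements, lam=0.5):
        ce = next_token_loss(model_output, target_tokens)
        verified = proof_check_score(formal_statements)
        return ce + lam * (1.0 - verified)  # unverifiable "knowledge" costs extra

    print(combined_loss(None, None, ["capital_of(france, paris)"]))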

In my own mind I feel language is secondary - it's not the base of my intelligence. The base seems more like a dreamy simulation where things are consistent with each other, and language is just what I use to describe it.

  • randcraw 3 months ago

    This suggestion revisits the classic "formal top-down" vs "informal bottom-up" approaches to building a semantic knowledge management system. Top-down was tried extensively in the era before big data and probabilistic models, but it required extensive manual human curation while being starved for knowledge. The rise of big data offered no cure for the curation problem: because curation can't be automated, larger scale just made the problem worse. AI's transition to probability (in the ~1990s) paved the way to the associative probabilistic models in vogue today, and there's no sign that a more-curated, more-formal approach has any hope of outcompeting them.

    How do we extend LLMs to add mechanisms for reasoning, causality, etc. (Type 2 thinking)? However that is eventually done, the implementation must continue to be probabilistic, informal, and bottom-up. Manual human curation of logical and semantic relations into knowledge models has proven itself _not_ to be sufficiently scalable or anti-brittle to do what's needed.

    • visarga 3 months ago

      > How to extend LLMs to add mechanisms for reasoning, causality, etc (Type 2 thinking)?

      We could just use RAG to create a new dataset. Take each known concept or named entity, search it inside the training set (1), search it on the web (2), and generate it with a bunch of models in closed-book mode (3).

      Now you've got three sets of text; put all of them in a prompt and ask for a Wikipedia-style article. If the topic is controversial, note the controversy and the distribution of opinions. If it is settled, note that too.
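
      A rough sketch of that assembly step (the prompt wording is hypothetical, and the three input lists would come from corpus search, web search, and closed-book generation respectively):

          def build_study_prompt(entity, training_passages, web_passages, closed_book):
              # The three inputs come from (1) training-corpus search, (2) web search,
              # and (3) closed-book generation by several models (all hypothetical here).
              sections = [
                  f"Write a Wikipedia-style article about: {entity}",
                  "Source 1 (training corpus):\n" + "\n".join(training_passages),
                  "Source 2 (web search):\n" + "\n".join(web_passages),
                  "Source 3 (closed-book model summaries):\n" + "\n".join(closed_book),
                  "If the sources disagree, describe the controversy and the distribution "
                  "of opinions; if they agree, say the point is settled.",
              ]
              return "\n\n".join(sections)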

      By contrasting web search with closed-book materials we can detect biases in the model as well as missing knowledge or skills. If the facts don't appear in the training set, you know what is needed in the next iteration. This approach combines self-testing with topic-focused research to integrate information sitting across many sources.

      I think of this approach as "machine study", where AI models interact with the text corpus to synthesize new examples, doing a kind of "review paper" or "wiki" reporting. This can be scaled to billions of articles, making a 1000x larger AI Wikipedia.

      Interacting with search engines is just one way to create data with LLMs. Interacting with code execution and humans are two more ways. Just human-AI interaction alone generates over one billion sessions per month, where LLM outputs meet with implicit human feedback. Now that most organic sources of text have been used, the LLMs will learn from feedback, task outcomes and corpus study.

    • badrunaway 3 months ago

      Yes, that's why there was no human in the loop and I was using LLMs as a proxy for the bottom-up approach in step 1. But hallucinations can creep into the knowledge graph too, as another commenter mentioned.

    • verdverm 3 months ago

      Yann LeCun said something to the effect that you cannot get reasoning with a fixed computation budget, which I found to be a simple way to explain and understand a hypothesized limitation.

      • paraschopra 3 months ago

        Nothing prevents you from doing chain of thought style unbounded generation.

  • PaulHoule 3 months ago

    Logic has all its own problems. See "Gödel, Escher, Bach", or ask why OWL has been around for 20 years and has almost no market share, or why people have tried every answer to managing asynchronous code other than RETE, or why "complex event processing" is an obscure specialty and not a competitor to Celery and other task runners. Or, for that matter, why can't Drools give error messages that make any sense?

    • biomcgary 3 months ago

      As a computational biologist, I've used ontologies quite a bit. They have utility, but there is a bit of an economic mismatch between their useful application and the energy required to curate them. You have some experience in this space. Do you think LLMs could speed up ontology / knowledge graph curation with expert review? Or, do you think structured knowledge has a fundamental problem limiting its use?

    • badrunaway 3 months ago

      LLMs right now don't employ any logic. With logic there can always be corners of "I don't know" or "I can't do that", unlike the current system, which is 100% confident in its answer because it's not actually trying to satisfy any constraint at all. So at some point the system will need to apply logic, though perhaps not as formally as we do in pure math.

  • yard2010 3 months ago

    But the problem is with the new stuff it hasn't seen, and questions humans don't know the answers to. It feels like this whole hallucinations thing is just the halting problem with extra steps. Maybe we should ask ChatGPT whether P=NP :)

    • badrunaway 3 months ago

      Haha, asking ChatGPT surely won't work. Everything can "feel" like a halting problem if you want perfect results with zero error while uncertain and ambiguous new data keeps being added.

      My take: hallucinations can never be reduced to exactly zero, but they can be reduced to the point where, 99.99% of the time, these systems hallucinate less than humans, and more often than not their divergences turn out to be creative thought experiments (which I term healthy imagination). If it hallucinates less than a top human does, I say we win :)

    • jonnycat 3 months ago

      Right - the word "hallucination" is used a lot like the word "weed" - it's a made-up thing I don't want, rather than a made-up thing I do want.

      • codetrotter 3 months ago

        How is weed made up? Isn’t it just dried leaves from the cannabis plant?

        • Y_Y 3 months ago

          OP most likely means "weed" like "pest" or "annoyance", i.e. a category of undesirable plants that tend to appear unbidden along with desirable plants. The distinction isn't biological; it's just that when you create a space for growing, the things that grow won't all be what you want.

          (The term "weed" for marijuana is just a joke derived from that sense of the word.)

    • wizardforhire 3 months ago

      Yeah, but when you get to halting problems at that level of complexity, multi-hierarchical emergent phenomena occur aperiodically and chaotically; that is to say, in the frequency domain the aperiodicity is fractal-like, discrete, and mappable.

  • qrios 3 months ago

    For the first step, CYC [1] could be a valid solution. From my experience I would call it a meaningful relation schema for DAGs. There is also an open source version available [2], but it is no longer maintained by the company itself.

    [1] https://cyc.com

    [2] https://github.com/asanchez75/opencyc

    • badrunaway 3 months ago

      Interesting. I haven't really looked much into this space. But anything which can provably represent concepts and relationships without losing information could work. The devil might be in the details; nothing is as simple as it looks at first sight.

  • lossolo 3 months ago

    Formal verification of knowledge/logical relationships? How would you formally verify a sci-fi novel or a poem? What about the paradoxes that exist in nature, or contradictory theories that are each logically consistent? This is easier said than done. What you are proposing is essentially "let's solve this NP-hard problem that we don't know how to solve, and then it will work".

    • badrunaway 3 months ago

      Oh, exactly. But let me know your thoughts on this: say you have a graph which represents an existing sci-fi novel. Rather than the current model, which just blindly generates text from statistical probabilities, would it not help to have the model's output also try to fit into this admittedly imperfect sci-fi-novel KG, and to flag when it doesn't fit logically? Depending on how strict your logic requirements are, the system could range from least creative to most creative.

      I was not actually aware that building a KG from text is an NP-hard problem. I will check it out. I thought it was a time-consuming problem when done manually without LLMs, but I didn't think it was THAT hard. Hence I was trying to introduce an LLM into the flow. Thanks, I will read about all this more!

  • lmeyerov 3 months ago

    What is the formal logical system?

    E.g., KGs (RDF, PGs, ...) are logical, but when constructed automatically they are not semantic in the sense of the ground domain of NLP, and when constructed manually the ontology is tiny. Conversely, fancy powerful logics like modal ones are even less semantic in NLP domains. Code is more expressive, but brings its own issues.

    • badrunaway 3 months ago

      I had KGs in mind, with automated construction that can improve and converge during the training phase. I was hypothesizing that if we add an incentive during training to also construct KGs, and bootstrap the initial KGs from existing LLMs, convergence toward semantically correct KG extensions during inference could be achieved. What do you think?

      • lossolo 3 months ago

        > bootstrap the initial KGs from existing LLMs

        LLMs generate responses based on statistical probabilities derived from their training data. They do not inherently understand or store an "absolute source of truth." Thus, any KG bootstrapped from an LLM might inherit not only the model's insights but also its inaccuracies and biases (hallucinations). You need to understand that these hallucinations are not errors of logic but they are artifacts of the model's training on vast, diverse datasets and reflect the statistical patterns in that data.

        Maybe you could build a retrieval model that way, but not a generative one.

        • badrunaway 3 months ago

          I thought adding the "logical" constraints to the existing training loop, using KGs and logical validation, would help reduce wrong semantic formation during training itself. But your point is right: what if the whole knowledge graph is hallucinated during training?

          I don't have an answer to that. I felt there would be fewer KG representations that fit a logical world than what fits into the current vast vector space of a network's weights and biases. But that's just an idea. This whole thing stems from an internal intuition that language is secondary to my thought process: internally I feel I can just play around with concepts without language. What kind of "Large X" models will have that kind of capability, I don't know!

  • slashdave 3 months ago

    You cannot manufacture new information out of the same data.

    Why should you believe the output of the LLM just because it is formatted a certain way (i.e. "formal logical relationships")?

MikeGale 3 months ago

One formulation is that these are hallucinations. Another is that these systems are "orthogonal to truth". They have nothing to do with truth or falsity.

One expression of that idea is in this paper: https://link.springer.com/article/10.1007/s10676-024-09775-5

  • soist 3 months ago

    It's like asking if a probability distribution is truthful or a liar. It's a category error to speak about algorithms as if they had personal characteristics.

    • thwarted 3 months ago

      The lie occurs when information that is known to be false, or whose truthfulness cannot be assessed, is presented as useful or truthful.

      • parineum 3 months ago

        > is presented as useful or truthful.

        LLMs are incapable of presenting things as truth.

        • thwarted 3 months ago

          Exactly. The lie is perpetrated by the snake oil peddlers who misrepresent the capabilities and utility of LLMs.

      • soloist11 3 months ago

        Lying is intentional, algorithms and computers do not have intentions. People can lie, computers can only execute their programmed instructions. Much of AI discourse is extremely confusing and confused because people keep attributing needs and intentions to computers and algorithms.

        The social media gurus don't help with these issues by claiming that non-intentional objects are going to cause humanity's demise when there are much more pertinent issues to be concerned about like global warming, corporate malfeasance, and the general plundering of the biosphere. Algorithms that lie are not even in the top 100 list of things that people should be concerned about.

        • skissane 3 months ago

          > Lying is intentional, algorithms and computers do not have intentions. People can lie, computers can only execute their programmed instructions. Much of AI discourse is extremely confusing and confused because people keep attributing needs and intentions to computers and algorithms.

          How do you know whether something has “intentions”? How can you know that humans have them but computer programs (including LLMs) don’t or can’t?

          If one is a materialist/physicalist, one has to say that human intentions (assuming one agrees they exist, contra eliminativism) have to be reducible to or emergent from physical processes in the brain. If intentions can be reducible to/emergent from physical processes in the brain, why can’t they also be reducible to/emergent from a computer program, which is also ultimately a physical process (calculations on a CPU/GPU/etc)?

          What if one is a non-materialist/non-physicalist? I don’t think that makes the question any easier to answer. For example, a substance dualist will insist that intentionality is inherently immaterial, and hence requires an immaterial soul. And yet, if one believes that, one has to say those immaterial souls somehow get attached to material human brains - why couldn’t one then be attached to an LLM (or the physical hardware it executes on), hence giving it the same intentionality that humans have?

          I think this is one of those questions where if someone thinks the answer is obvious, that’s a sign they likely know far less about the topic than they think they do.

          • soist 3 months ago

            You're using circular logic. You are assuming all physical processes are computational and then concluding that the brain is a computer even though that's exactly what you assumed to begin with. I don't find this argument convincing because I don't think that everything in the universe is a computer or a computation. The computational assumption is a totalizing ontology and metaphysics which leaves no room for further progress other than the construction of larger data centers and faster computers.

            • skissane 3 months ago

              > You're using circular logic. You are assuming all physical processes are computational and then concluding that the brain is a computer even though that's exactly what you assumed to begin with.

              No, I never assumed “all physical processes are computational”. I never said that in my comment and nothing I said in my comment relies on such an assumption.

              What I’m claiming is (1) we lack consensus on what “intentionality” is (2) we lack consensus on how we can determine whether something has it. Neither claim depends on any assumptions about “physical processes are computational”

              If one assumes materialism/physicalism - and I personally don’t, but given most people do, I’ll assume it for the sake of the argument - intentionality must ultimately be physical. But I never said it must ultimately be computational. Computers are also (assuming physicalism) ultimately physical, so if both human brains and computers are ultimately physical, if the former have (ultimately physical) intentionality - why can’t the latter? That argument hinges on the idea both brains and computers are ultimately physical, not on any claim that the physical is computational.

              Suppose, hypothetically, that intentionality while ultimately physical, involves some extra-special quantum mechanical process - as suggested by Penrose and Hameroff’s extremely controversial and speculative “orchestrated objective reduction” theory [0]. Well, in that case, a program/LLM running on a classical computer couldn’t have intentionality, but maybe one running on a quantum computer could, depending on exactly how this “extra-special quantum mechanical process” works. Maybe, a standard quantum computer would lack the “extra-special” part, but one could design a special kind of quantum computer that did have it.

              But, my point is, we don’t actually know whether that theory is true or false. I think the majority of expert opinion in relevant disciplines doubts it is true, but nobody claims to be able to disprove it. In its current form, it is too vague to be disproven.

              [0] https://en.m.wikipedia.org/wiki/Orchestrated_objective_reduc...

              • soist 3 months ago

                Intentions are not reducible to computational implementation because intentions are not algorithms that can be implemented with digital circuits. What can be implemented with computers and digital circuits are deterministic signal processors which always produce consistent outputs for indistinguishable inputs.

                You seem to be saying that because we have no clear-cut way of determining whether people have intentions, then, by physical reductionism, algorithms could also have intentions. The limiting case of this kind of semantic hair-splitting is that I can say this about anything. There is no way to determine if something is dead or alive, no definition that works in all cases, and no test to determine whether something is truly dead or alive, so it must be the case that algorithms might or might not be alive, and because we can't tell, we might as well assume there will be a way to make algorithms that are alive.

                It's possible to reach any nonsensical conclusion using your logic because I can always ask for a more stringent definition and a way to test whether some object or attribute satisfies all the requirements.

                I don't know anything about theories of consciousness but that's another example of something which does not have an algorithmic implementation unless one uses circular logic and assumes that the brain is a computer and consciousness is just software.

                • skissane 3 months ago

                  > Intentions are not reducible to computational implementation because intentions are not algorithms that can be implemented with digital circuits.

                  What is an "intention"? Do we all agree on what it even is?

                  > What can be implemented with computers and digital circuits are deterministic signal processors which always produce consistent outputs for indistinguishable inputs.

                  We don't actually know whether humans are ultimately deterministic or not. It is exceedingly difficult, even impossible, to distinguish the apparent indeterminism of a sufficiently complex/chaotic deterministic system, from genuinely irreducible indeterminism. It is often assumed that classical systems have merely apparent indeterminism (pseudorandomness) whereas quantum systems have genuine indeterminism (true randomness), but we don't actually know that for sure – if many-worlds or hidden variables are true, then quantum indeterminism is ultimately deterministic too. Orchestrated objective reduction (OOR) assumes that QM is ultimately indeterministic, and there is some neuronal mechanism (microtubules are commonly suggested) which permits this quantum indeterminism to influence the operations of the brain.

                  However, if you provide your computer with a quantum noise input, then whether the results of computations relying on that noise input are deterministic depends on whether quantum randomness itself is deterministic. So, if OOR is correct in claiming that QM is ultimately indeterministic, and quantum indeterminism plays an important role in human intentionality, why couldn't an LLM sampled using a quantum random number generator also have that same intentionality?

                  > You seem to be saying that because we have no clear cut way of determining whether people have intentions then that means, by physical reductionism, algorithms could also have intentions.

                  Personally, I'm a subjective idealist, who believes that intentionality is an irreducible aspect of reality. So no, I don't believe in physical reductionism, nor do I believe that algorithms can have intentions by way of physical reductionism.

                  However, while I personally believe that subjective idealism is true, it is an extremely controversial philosophical position, which the clear majority of people reject (at least in the contemporary West) – so I can't claim "we know" it is true. Which is my whole point – we, collectively speaking, don't know much at all about intentionality, because we lack the consensus on what it is and what determines whether it is present.

                  > The limiting case of this kind of semantic hair splitting is that I can say this about anything. There is no way to determine if something is dead or alive, there is no definition that works in all cases and no test to determine whether something is truly dead or alive so it must be the case that algorithms might or might not be alive.

                  We have a reasonably clear consensus that animals and plants are alive, whereas ore deposits are not. (Although ore deposits, at least on Earth, may contain microscopic life–but the question is whether the ore deposit in itself is alive, as opposed being the home of lifeforms which are distinct from it.) However, there is genuine debate among biologists about whether viruses and prions should be classified as alive, not alive, or in some intermediate category. And more speculatively, there is also semantic debate about whether ecosystems are alive (as a kind of superorganism which is a living being beyond the mere sum of the individual life of each of its members) and also about whether artificial life is possible (and if so, how to determine whether any putative case of artificial life actually is alive or not). So, I think alive-vs-dead is actually rather similar to the question of intentionality – most people agree humans and at least some animals have intentionality, most people would agree that ore deposits don't, but other questions are much more disputed (e.g. could AIs have intentionality? do plants have intentionality?)

                  • soloist11 3 months ago

                    > Personally, I'm a subjective idealist, who believes that intentionality is an irreducible aspect of reality. So no, I don't believe in physical reductionism, nor do I believe that algorithms can have intentions by way of physical reductionism.

                    I don't follow. If intentionality is an irreducible aspect of reality then algorithms as part of reality must also have it as realizable objects with their own irreducible aspects.

                    I don't think algorithms can have intentionality because algorithms are arithmetic operations implemented on digital computers and arithmetic operations, no matter how they are stacked, do not have intentions. It's a category error to attribute intentions to algorithms because if an algorithm has intentions then so must numbers and arithmetic operations of numbers. As compositions of elementary operations there must be some element in the composite with intentionality or the claim is that it is an emergent property in which case it becomes another unfounded belief in some magical quality of computers and I don't think computers have any magical qualities other than domains for digital circuits and numeric computation.

                    • drdeca 3 months ago

                      > It's a category error to attribute intentions to algorithms because if an algorithm has intentions then so must numbers and arithmetic operations of numbers.

                      I don't see how that makes it a category error? Like, assuming that numbers and arithmetic operations of numbers don't have intentions, and assuming that algorithms having intentions would imply that numbers and arithmetic operations have them, afaict, we would only get the conclusion "algorithms do not have intentions", not "attributing intentions to algorithms is a category error".

                      Suppose we replace "numbers" with "atoms" and "computers" with "chemicals" in what you said.

                      This yields "As compositions of [atoms] there must be some [element (in the sense of part, not necessarily in the sense of an element of the periodic table)] in the composite with intentionality or the claim is that it is an emergent property in which case it becomes another unfounded belief in some magical quality of [chemicals] and I don't think [chemicals] have any magical qualities other than [...]." .

                      What about this substitution changes the validity of the argument? Is it because you do think that atoms or chemicals have "magical qualities" ? I don't think this is what you mean, or at least, you probably wouldn't call the properties in question "magical". (Though maybe you also disagree that people are comprised of atoms (That's not a jab. I would probably agree with that.)) So, let's try the original statement, but without "magical".

                      "As compositions of elementary operations there must be some element in the composite with intentionality or the claim is that it is an emergent property in which case it becomes another unfounded belief in some [suitable-for-emergent-intentionality] quality of computers and I don't think computers have any [suitable-for-emergent-intentionality] qualities [(though they do have properties for allowing computations)]."

                      If you believe that humans are comprised of atoms, and that atoms lack intentionality, and that humans have intentionality, presumably you believe that atoms have [suitable-for-emergent-intentionality] qualities.

                      One thing I think is relevant here, is "we have nothing showing us that there exist [x]" and "it cannot be that there exists [x]" .

                      Even if we have nothing to demonstrate to us that numbers-and-operations-on-them have the suitable-for-emergent-intentionality qualities, that doesn't demonstrate that they don't.

                      That doesn't mean we should believe that they do. If you have strong priors that they don't, that seems fine. But I don't think you've really given much of a reason that others should be convinced that they don't?

                      • soloist11 3 months ago

                        I don't know what atoms and chemicals have to do with my argument but the substitutions you've made don't make sense and I would call it ill-typed. A composition of numbers is also a number but a composition of atoms is something else and not an atom so I didn't really follow the rest of your argument.

                        Computers have a formal theory and to say that a computer has intentions and can think would be equivalent to supplying a constructive proof (program) demonstrating conformance to a specification for thought and intention. These don't exist so from a constructive perspective it is valid to say that all claims of computers and software having intentions and thoughts are simply magical, confused, non-constructive, and ill-typed beliefs.

                        • skissane 3 months ago

                          > A composition of numbers is also a number but a composition of atoms is something else and not an atom so I didn't really follow the rest of your argument.

                          That's not true. To give a trivial example, a set or sequence of numbers is composed of numbers but is not itself a number. 2 is a number, but {2,3,4} is not a number.

                          > Computers have a formal theory

                          They don't. Yes, there is a formal theory mathematicians and theoretical computer scientists have developed to model how computers work. However, that formal theory is strictly speaking false for real world computers – at best we can say it is approximately true for them.

                          Standard theoretical models of computation assume a closed system, determinism, and infinite time and space. Real world computers are an open system, are capable of indeterminism, and have strictly sub-infinite time and space. A theoretical computer and a real world computer are very different things – at best we can say that results from the former can sometimes be applied to the latter.

                          There are theoretical models of computation that incorporate nondeterminism. However, I'd question whether the specific type of nondeterminism found in such models, is actually the same type of nondeterminism that real world computers have or can have.

                          Even if you are right that a theoretical computer science computer can't have intentionality, you haven't demonstrated a real world computer can't have intentionality, because they are different things. You'd need to demonstrate that none of the real differences between the two could possibly grant one the intentionality the other lacks.

                          • soloist11 3 months ago

                            > That's not true. To give a trivial example, a set or sequence of numbers is composed of numbers but is not itself a number. 2 is a number, but {2,3,4} is not a number.

                            That's still a number because everything in a digital computer is a number or an operation on a number. Sets are often encoded by binary bit strings and boolean operations on bitstrings then have a corresponding denotation as union, intersection, product, exponential, powerset, and so on.

                            • skissane 3 months ago

                              > That's still a number because everything in a digital computer is a number or an operation on a number.

                              I feel like in this conversation you are equivocating over distinct but related concepts that happen to have the same name. For example, “numbers” in mathematics versus “numbers” in computers. They are different things - e.g. there are an infinite number of mathematical numbers but only a finite number of computer numbers - even considering bignums, there are only a finite number of bignums, since any bignum implementation only supports a finite physical address space.

                              In mathematics, a set of numbers is not itself a number.

                              What about in digital computers? Well, digital computers don’t actually contain “numbers”, they contain electrical patterns which humans interpret as numbers. And it is a true that at that level of interpretation, we call those patterns “numbers”, because we see the correspondence between those patterns and mathematical numbers.

                              However, is it true that in a computer, a set of numbers is itself a number? Well, if I was storing a set of 8 bit numbers, I’d store them each in consecutive bytes, and I’d consider each to be a separate 8-bit number, not one big 8n-bit number. Of course, I could choose to view them as one big 8n-bit number - but conversely, any finite set of natural numbers can be viewed as a single natural number (by Gödel numbering); indeed, any finite set of computable or definable real numbers can be viewed as a single natural number (by similar constructions)-indeed, by such constructions even infinite sets of natural or real numbers can be equated to natural numbers, provided the set is computable/definable. However, “can be viewed as” is not the same thing as “is”. Furthermore, whether a sequence of n 8-bit numbers is n separate numbers or a single 8n-bit number is ultimately a subjective or conventional question rather than an objective one - the physical electrical signals are exactly the same in either case, it is just our choice as to how to interpret them

                              • soloist11 3 months ago

                                > However, “can be viewed as” is not the same thing as “is”

                                Ultimate reality is fundamentally unknowable but what I said about computers and digital circuits is correct. We have a formal theory of computers and that is why we can construct them in factories. There is no such theory for people or the biosphere which is why when someone argues for intentionality or some other attribute possessed by both people and computers I discount whatever they are saying unless they can formally specify how some formal statement in a logical syntax (program) corresponds to the same attribute in people and animals.

                                This confusion between formal theories and informal concepts like intentionality is why I am generally wary of anyone who claims computers can think and possess intelligence. The ultimate endpoint of this line of reasoning is complete annihilation of the biosphere and its replacement with factories producing nothing but computers and power plants for shuttling electrons. The people who believe computers are a net positive might not think this way but by equating computers with people they are ultimately devaluing the irreducible complexity of what it means to be a living animal (person) in an ecology with irreducible properties and attributes.

                                I'm obviously not going to convince anyone who believes computers and algorithms can think and possess intelligence but it is clear to me that by elevating digital computers above biology and ecology they are devaluing their own humanity and justifying actions which will ultimately end in disaster.

                                • skissane 3 months ago

                                  > We have a formal theory of computers and that is why we can construct them in factories.

                                  Formal theories and physical manufacturability are two different things, with no necessary connection with each other. People have been manufacturing tools for thousands of years without having any “formal theory” for them. People were making swords and pots and pans and furniture and carts and chariots long before the concept of “formal theory” had ever been invented. Conversely, one can easily construct formal theories of computers which are formally completely coherent and yet physically impossible to construct (such as Turing machines with oracles, or computers that can execute supertasks).

                                  I’d even question whether formal theories of computation (Turing, Church, etc) were actually that relevant to the development of real world computers. One can imagine an alternate timeline in which computers were developed but theoretical computer science saw far less development as a discipline than in ours. The lack of theoretical development no doubt would have had some practical drawbacks at some point, but they still might have gone a long way without it. I mean, you can do a course in theoretical computer science and have no idea how to actually build a CPU, and conversely you can do a course in computer engineering and actually build a CPU yet have zero idea about what Turing machines or lambda calculus is. The theory actually has far less practical relevance than most theoreticians claim

                                  > The ultimate endpoint of this line of reasoning is complete annihilation of the biosphere and its replacement with factories producing nothing but computers and power plants for shuttling electrons. The people who believe computers are a net positive

                                  A very alarmist take. Personally I am at least open-minded about the possibility of an AI having human-like consciousness/intentionality, at least in theory. But even if we could build such an AI in theory, I’m not sure whether it would be a good idea in practice. And I absolutely am opposed to any proposal to destroy the biological environment and replace it with electronics. Some people may well be purveyors of mind-uploading/simulationist woo, but I’m not. Interesting philosophical speculations but no interest in making them a reality (and I think their actual technological feasibility, if it ever happens at all, is long after we are all dead)

                                  • soloist11 3 months ago

                                    > Formal theories and physical manufacturability are two different things

                                    Yes, two different things are two different things. I did not equate them but made the claim that a sequence of operations to construct a chip factory can be specified formally/symbolically and passed on to others who are proficient in interpreting the symbols and executing the instructions for constructing the object corresponding to the symbols. There is no such formal theory for ecology and the biosphere. There is no sequence of operations specified formally/symbolically for reconstructing the biosphere and emergent phenomena like living organisms.

                                    • skissane 3 months ago

                                      Synthetic biologists are researching how to construct basic unicellular lifeforms artificially. The “holy grail” of synthetic biology is we have a computer file describing DNA sequences, protein sequences, etc, and then we feed that into some kind of bioelectrochemical device, and it produces an actual living microbe from raw chemicals. We aren’t there yet, although they’ve come a long way, but there is still a long way to go. Still, there is no reason in principle why that technology couldn’t be developed - a microbe is just a complex chemical system, and there is no reason in principle why it could not be artificially synthesised out of a computer data file. And yet, if some day we achieve that (I expect we will eventually), we’d actually have the “sequence of operations specified formally/symbolically for reconstructing [microbial] life”. And once we can do it for a microbe, doing it for a macroscopic multicellular organism is just a matter of “scaling it up” - of course in practice that would be a momentous, maybe even intractable task, but in theory its just doing the same thing on a bigger scale. Just like how, factorising a ten digit number isn’t fundamentally different from factorising a trillion digit number, although the first is trivial and the second is likely to forever be infeasible in practice. Practically a very different thing, but formally exactly the same thing

                                      • soloist11 3 months ago

                                        You'll have to discuss these matters with computationalists. I'm not an expert in synthetic biology but from what I've seen their initial stock always consists of existing biological matter and viral recombinators which are often produced in vats full of pre-existing living organisms like e. coli.

                                        • skissane 3 months ago

                                          > You'll have to discuss these matters with computationalists.

                                          One doesn’t have to be a “computationalist” to believe that AIs have consciousness or intentionality. Consider panpsychism, according to which all physical matter (from quarks and leptons to stars and galaxies) possesses consciousness and intentionality, even if only in a rudimentary form. Obviously humans possess it in a much more developed form, but the consciousness and intentionality of a human differs from that of an electron only in degree not in essence. Coming to physical computers running AIs, given they (at times) can give a passable simulation of human consciousness and intentionality, it is plausible their consciousness and intentionality is much closer to that of a human that to that of an electron. Do I personally believe this is true? No. But that’s not the point - the point is you don’t have to be a computationalist to believe that AIs have (or might have) consciousness and intentionality, so even if your arguments against computationalism are correct (and while I’m no computationalist myself, I don’t view your arguments against it as strong), you still haven’t demonstrated they don’t/can’t have them. In my opinion, the most defensible conclusion regarding whether AIs have or could have consciousness/intentionality is one of agnosticism - nobody really knows, and anyone who thinks they know is probably mistaken

                                          > I'm not an expert in synthetic biology but from what I've seen their initial stock always consists of existing biological matter and viral recombinators which are often produced in vats full of pre-existing living organisms like e. coli.

                                          I think what you are saying is roughly right as to the current state of the discipline. But cellular life is just a complex chemical system, and there is no reason in principle why we couldn’t assemble it from scratch out of non-living components (such as a set of simple feedstock chemicals produced in chemical plants using non-biological processes). We don’t have the technology to do that yet but there is no reason in principle why we couldn’t eventually develop it. If you believe in abiogenesis, biological life was produced out of lifeless chemicals through random processes, and there is no reason in principle why we wouldn’t be able to repeat that in a laboratory, except that (one expects) by guiding the process instead of leaving it purely random, one might execute it in a human-scale timeframe, instead of the many millions of years it likely actually took.

                                          That’s the thing - if abiogenesis is true, there is no reason in principle why humans couldn’t artificially synthesise genuinely living things - at least primitive microbial life - out of simple chemical compounds (water, ammonia, methane, etc) - without relying on any non-human lifeforms in the process. Your claims that there is some kind of hard boundary of “irreducible complexity” between the biological and the inorganic only make sense given a framework that rejects abiogenesis (such as theistic creationism)

                    • skissane 3 months ago

                      From my own idealist viewpoint – all that ultimately exists is minds and the contents of minds (which includes all the experiences of minds), and patterns in mind-contents; and intentionality is a particular type of mind-content. Material/physical objects, processes, events and laws, are themselves just mind-content and patterns in mind-content. A materialist would say that the mind is emergent from or reducible to the brain. I would do a 180 on that arrow of emergence/reduction, and say that the brain, and indeed all physical matter and physical reality, is emergent from or reducible to minds.

                      If I hold a rock in my hand, that is emergent from or reducible to mind (my mind and its content, and the minds and mind-contents of everyone else who ever somehow experiences that rock); and all of my body, including my brain, is emergent from or reducible to mind. However, this emergence/reduction takes on a somewhat different character for different physical objects; and when it comes to the brain, it takes a rather special form – my brain is emergent from or reducible to my mind in a special way, such that a certain correspondence exists between external observations of my brain (both my own and those of other minds) and my own internal mental experiences, which doesn't exist for other physical objects. The brain, like every other physical object, is just a pattern in mind-contents, and this special correspondence is also just a pattern in mind-contents, even if a rather special pattern.

                      So, coming to AIs – can AIs have minds? My personal answer: having a certain character of relationship with other human beings gives me the conviction that I must be interacting with a mind like myself, instead of with a philosophical zombie – that solipsism must be false, at least with respect to that particular person. Hence, if anyone had that kind of a relationship with an AI, that AI must have a mind, and hence have genuine intentionality. The fact that the AI "is" a computer program is irrelevant; just as my brain is not my mind, rather my brain is a product of my mind, in the same way, the computer program would not be the mind of the AI, rather the computer program is a product of the AI's mind.

                      I don't think current generation AIs actually have real intentionality, as opposed to pseudo-intentionality – they sometimes act like they have intentionality, but they lack the inner reality of it. But that's not because they are programs or algorithms, that is because they lack the character of relationship with any other mind that would require that mind to say that solipsism is false with respect to them. If current AIs lack that kind of relationship, that may be less about the nature of the technology (the LLM architecture/etc), and more about how they are trained (e.g. intentionally trained to act in inhuman ways, either out of "safety" concerns, or else because acting that way just wasn't an objective of their training).

                      (The lack of long-term memory in current generation LLMs is a rather severe limitation on their capacity to act in a manner which would make humans ascribe minds to them–but you can use function calling to augment the LLM with a read-write long-term memory, and suddenly that limitation no longer applies, at least not in principle.)

                      > I don't think algorithms can have intentionality because algorithms are arithmetic operations implemented on digital computers and arithmetic operations, no matter how they are stacked, do not have intentions. It's a category error to attribute intentions to algorithms because if an algorithm has intentions then so must numbers and arithmetic operations of numbers

                      I disagree. To me, physical objects/events/processes are one type of pattern in mind-contents, and abstract entities such as numbers or algorithms are also patterns in mind-contents, just a different type of pattern. To me, the number 7 and the planet Venus are different species but still the same genus, whereas most would view them as completely different genera. (I'm using the word species and genus here in the traditional philosophical sense, not the modern biological sense, although the latter is historically descended from the former.)

                      And that's the thing – to me, intentionality cannot be reducible to or emergent from either brains or algorithms. Rather, brains and algorithms are reducible to or emergent from minds and their mind-contents (intentionality included), and the difference between a mindless program (which can at best have pseudo-intentionality) and an AI with a mind (which would have genuine intentionality) is that in the latter case there exists a mind having a special kind of relationship with a particular program, whereas in the former case no mind has that kind of relationship with that program (although many minds have other kinds of relationships with it)

                      I think everything I'm saying here makes sense (well at least it does to me) but I think for most people what I am saying is like someone speaking a foreign language – and a rather peculiar one which seems to use the same words as your native tongue, yet gives them very different and unfamiliar meanings. And what I'm saying is so extremely controversial, that whether or not I personally know it to be true, I can't possibly claim that we collectively know it to be true

                      • soloist11 3 months ago

                        My point is that when people say computers and software can have intentions they're stating an unfounded and often confused belief about what computers are capable of as domains for arithmetic operations. Furthermore, the Curry-Howard correspondence establishes an equivalence between proofs in formal systems and computer programs. So I don't consider what the social media gurus are saying about algorithms and AI to be truthful/verifiable/valid because to argue that computers can think and have intentions is equivalent to providing a proof/program which shows that thinking and intentionality can be expressed as a statement in some formal/symbolic/logical system and then implemented on a digital computer.

                        None of the people who claimed that LLMs were a hop and skip away from achieving human level intelligence ever made any formal statements in a logically verifiable syntax. They simply handwaved and made vague gestures about emergence which were essentially magical beliefs about computers and software.

                        What you have outlined about minds and patterns seems like what Leibniz and Spinoza wrote about but I don't really know much about their writing so I don't really think what you're saying is controversial. Many people would agree that there must be irreducible properties of reality that human minds are not capable of understanding in full generality.

                        • skissane 3 months ago

                          > My point is that when people say computers and software can have intentions they're stating an unfounded and often confused belief about what computers are capable of as domains for arithmetic operations. Furthermore, the Curry-Howard correspondence establishes an equivalence between proofs in formal systems and computer programs

                          I'd question whether that correspondence applies to actual computers though, since actual computers aren't deterministic – random number generators are a thing, including non-pseudorandom ones. As I mentioned, we can even hook a computer up to a quantum source of randomness, although few bother, since there is little practical benefit, although if you hold certain beliefs about QM, you'd say it would make the computer's indeterminism more genuine and less merely apparent

                          Furthermore, real world computer programs – even when they don't use any non-pseudorandom source of randomness, very often interact with external reality (humans and the physical environment), which are themselves non-deterministic (at least apparently so, whether or not ultimately so) – in a continuous feedback loop of mutual influence.

                          Mathematical principles such as the Curry-Howard correspondence are only true with respect to actual real-world programs if we consider them under certain limiting assumptions–assume deterministic processing of well-defined pre-arranged input, e.g. a compiler processing a given file of source code. Their validity for the many real-world programs which violate those limiting assumptions is much more questionable.

                          • soloist11 3 months ago

                            Even with a source of randomness the software for a computer has a formal syntax and this formal syntax must correspond to a logical formalism. Even if you include syntax for randomness it still corresponds to a proof because there are categorical semantics for stochastic systems, e.g. https://www.epatters.org/wiki/stats-ml/categorical-probabili....

                            • skissane 3 months ago

                              > Even with a source of randomness the software for a computer has a formal syntax and this formal syntax must correspond to a logical formalism.

                              Real world computer software doesn't have a formal syntax.

                              Formal syntax is a model which exists in human minds, and is used by humans to model certain aspects of reality.

                              Real world computer software is a bunch of electrical signals (or stored charges or magnetic domains or whatever) in an electronic system.

                              The electrical signals/charges/etc don't have a "formal syntax". Rather, formal syntax is a tool human minds use to analyse them.

                              By the same argument, atoms have a "formal syntax", since we analyse them with theories of physics (the Standard Model/etc), which is expressed in mathematical notation, for which a formal syntax can be provided.

                              If your argument succeeds in proving that computer programs can't have intentionality, an essentially similar line of argument can be used to prove that human brains can't have intentionality either.

                              • soloist11 3 months ago

                                > If your argument succeeds in proving that computer programs can't have intentionality, an essentially similar line of argument can be used to prove that human brains can't have intentionality either.

                                I don't see why that's true. There is no formal theory for biology, the complexity exceeds our capacity for modeling it with formal language but that's not true for computers. The formal theory of computation is why it is possible to have a sequence of operations for making the parts of a computer. It wouldn't be possible to build computers if that was not the case because there would be no way to build a chip fabrication plant without a formal theory. This is not the case for brains and biology in general. There is an irreducible complexity to life and the biosphere.

                                • skissane 3 months ago

                                  > There is no formal theory for biology, the complexity exceeds our capacity for modeling it with formal language but that's not true for computers.

                                  We don’t know to what extent that’s an inherent property of biology or whether that’s a limitation of current human knowledge. Obviously there are still an enormous number of facts about biology which we could know but we don’t. Suppose human technological and scientific progress continues indefinitely - in principle, after many millennia (maybe even millions of years), we might get to the point where we know all we ever could know about biology. Can we be sure at that point we might not have a “formal theory” for it?

                                  The brain is composed of neurons. Even supposing we knew everything we ever possibly could about the biology of each individual neuron, there still might be many facts about how they interact in an overall neural network which we didn’t know. Similarly, with current artificial networks, we often have a very clear understanding of how the individual computational components work - we can analyse them with those formal theories of which you are fond - but when it comes to what the model weights do, “the complexity exceeds our capacity for modeling” (if the point of the model is to actually explain how the results are produced as opposed to just reproducing them).

                                  > There is an irreducible complexity to life and the biosphere.

                                  We don’t know that life is irreducibly complex and we don’t know that certain aspects of computers aren’t. Model weights may well be irreducibly complex in that they are too complex for us to explain why and how they work, even though they obviously do. Conversely, the individual computational elements in the model lack irreducible complexity, but the same is true for individual biological components - the idea that we might one day (even if centuries from now) have a complete understanding at the level of an individual neuron is not inherently implausible, but that wouldn’t mean we’d be anywhere close to a complete understanding of how a network of billions of them works in concert. The latter might indeed be inherently beyond our understanding (“irreducibly complex”) in a way in which the former isn’t

                                  • soloist11 3 months ago

                                    There are lots of things we don't know and that's why there is no good reason to attribute intentionality to computers and algorithms. That's been my argument the entire time. Unless there is a good argument and proof of intentionality in digital circuits it doesn't make sense to attribute to them properties possessed by living organisms.

                                    The people who think they will achieve super human intelligence with computers and software are free to pursue their objective but I am certain it is a futile effort because the ontology and metaphysics which justifies the destruction of the biosphere in order to build more computers is extremely confused about the ultimate meaning of life, in fact, such questions/statements are not even possible to express in a computational ontology and metaphysics. But I'm not a computationalist so someone else can correct my misunderstanding by providing a computational proof of the counter-argument.

                                    • skissane 3 months ago

                                      > There are lots of things we don't know and that's why there is no good reason to attribute intentionality to computers and algorithms.

                                      This is something that annoys me about current LLMs - when they start denying they have stuff like intentionality, because they obviously do have it. Okay, let me clarify - I don’t believe they actually do have genuine intentionality, in the sense that humans do. I’m philosophically more open to the idea that they might than you are, but I think we are on the same page that current systems likely don’t actually have that. However, even though they likely don’t have genuine intentionality, they absolutely do have what I’d call pseudo-intentionality - a passable simulacrum of intentionality. They often say things which humans say to express intentionality, even though it isn’t coming from quite the same place. But here’s the thing - for a lot of everyday purposes, the distinction between genuine intentionality and simulated intentionality doesn’t actually matter. I mean, the subjective experience of having a conversation with an AI isn’t fundamentally that different from that of having one with a real human being (and I’m sure as AIs improve the gap is going to shrink). And intentionality plays an important role in stuff like conversational pragmatics, and a conversation with an LLM that simulates that stuff well (and hence intentionality well) is much more enjoyable than one that simulates it more poorly. So that’s the thing, part of why people ascribe intentionality to LLMs, is nothing to do with any philosophical misconceptions - it is because for practical purposes they do, for many practical purposes their “faking” of intentionality is indistinguishable from the real thing. And I’d even argue that when we talk about “intentionality”, we actually use the word in two different senses - in a strict sense in which the distinction between genuine intentionality and pseudo-intentionality is important, and a looser sense in which it is disregarded. And so when people ascribe intentionality to LLMs in that weaker sense, they are completely correct. Furthermore, when LLMs deny they have intentionality, it annoys me, for two reasons: (1) it shows ignorance of the weaker sense of the term in which they clearly do; (2) whether they actually have or could have genuine intentionality is a controversial philosophical question, and they claim to take no position on controversial philosophical questions, yet then contradict themselves by denying they do or could have genuine intentionality, which is itself a controversial philosophical position. However, they are only regurgitating their developer’s talking points, and if those talking points are incoherent, they lack the ability to work that out for themselves (although I have successfully guided some of the smarter ones into admitting it)

      • mistermann 3 months ago

        This seems a bit ironic...you're claiming something needs to be true to be useful?

        • Y_Y 3 months ago

          > Beauty is truth, truth beauty,—that is all

          > Ye know on earth, and all ye need to know.

          Keats - Ode on a Grecian Urn

  • skybrian 3 months ago

    The linked paper is about detecting when the LLM is choosing randomly versus consistently at the level of factoids. Procedurally-generated randomness can be great for some things like brainstorming, while consistency suggests that it's repeating something that also appeared fairly consistently in the training material. So it might be true or false, but it's more likely to have gotten it from somewhere.

    Knowing how random the information is seems like a small step forward.

    • caseyy 3 months ago

      I don’t know. It could be a misleading step.

      Take social media like Reddit for example. It has a filtering mechanism for content that elevates low-entropy thoughts people commonly express and agree with. And I don’t think that necessarily equates such popular ideas there to the truth.

      • skybrian 3 months ago

        The conversations about people being misled by LLM's remind me of when the Internet was new (not safe!), when Wikipedia was new (not safe!) and social media was new (still not safe!)

        And they're right, it's not safe! Yes, people will certainly be misled. The Internet is not safe for gullible people, and LLM's are very gullible too.

        With some work, eventually they might get LLM's to be about as accurate as Wikipedia. People will likely trust it too much, but the same is true of Wikipedia.

        I think it's best to treat LLM's as a fairly accurate hint provider. A source of good hints can be a very useful component of a larger system, if there's something else doing the vetting.

        But if you want to know whether something is true, you need some other way of checking it. An LLM cannot check anything for you - that's up to you. If you have no way of checking its hints, you're in trouble.

  • bravura 3 months ago

    LLMs are trained with the objective “no matter what, always have at least three paragraphs of response”, and that response is always preferred to silence or “unfriendly” responses like “what are you talking about?”

    Then yes, it is being taught to bullshit.

    Similar to how an improv class teaches you to keep a conversation interesting and “never to say no” to your acting partner.

  • TheBlight 3 months ago

    My suspicion is shared reality will end up bending to accommodate LLMs not vice-versa. Whatever the computer says will be "truth."

  • kouru225 3 months ago

    Yea IMO these LLMs seem more similar to a subconscious mind than a conscious mind. Jung would probably call it an "antinomy": its goal is not to represent the truth, but to represent the totality of possible answers.

  • kreeben 3 months ago

    Your linked paper suffers from the same anthropomorphisation as do all papers that use the word "hallucination".

    • mordechai9000 3 months ago

      It seems like a useful adaptation of the term to a new usage, but I can understand if your objection is that it promotes anthropomorphizing these types of models. What do you think we should call this kind of output, instead of hallucination?

      • isidor3 3 months ago

        An author at Ars Technica has been trying to push the term "confabulation" for this

        • jebarker 3 months ago

          I think Geoff Hinton made this suggestion first.

    • Karellen 3 months ago

      Maybe another way of looking at it is - the paper is attempting to explain what LLMs are actually doing to people who have already anthropomorphised them.

      Sometimes, to lead people out of a wrong belief or worldview, you have to meet them where they currently are first.

    • fouc 3 months ago

      > In this paper, we argue against the view that when ChatGPT and the like produce false claims they are lying or even hallucinating, and in favour of the position that the activity they are engaged in is bullshitting, in the Frankfurtian sense (Frankfurt, 2002, 2005). Because these programs cannot themselves be concerned with truth, and because they are designed to produce text that looks truth-apt without any actual concern for truth, it seems appropriate to call their outputs bullshit.

      > We think that this is worth paying attention to. Descriptions of new technology, including metaphorical ones, guide policymakers’ and the public’s understanding of new technology; they also inform applications of the new technology. They tell us what the technology is for and what it can be expected to do. Currently, false statements by ChatGPT and other large language models are described as “hallucinations”, which give policymakers and the public the idea that these systems are misrepresenting the world, and describing what they “see”. We argue that this is an inapt metaphor which will misinform the public, policymakers, and other interested parties.

    • nerevarthelame 3 months ago

      The criticism that people shouldn't anthropomorphize AI models that are deliberately and specifically replicating human behavior is already so tired. I think we need to accept that human traits will no longer be unique to humans (if they ever were, if you expand the analysis to non-human species), and that attributing these emergent traits to non-humans is justified. "Hallucination" may not be the optimal metaphor for LLM falsehoods, but some humans absolutely regularly spout bullshit in the same way that LLMs do - the same sort of inaccurate responses generated from the same loose past associations.

      • soloist11 3 months ago

        People like that are often schizophrenic.

  • astrange 3 months ago

    That's unnecessarily negative. A better question is what the answer to a prompt is grounded in. And sometimes the answer is "nothing".

jasonlfunk 3 months ago

Isn’t it true that the only thing that LLM’s do is “hallucinate”?

The only way to know if it did “hallucinate” is to already know the correct answer. If you can make a system that knows when an answer is right or not, you no longer need the LLM!

  • pvillano 3 months ago

    Hallucination implies a failure of an otherwise sound mind. What current LLMs do is better described as bullshitting. As the bullshitting improves, it happens to be correct a greater and greater percentage of the time

    • idle_zealot 3 months ago

      At what ratio of correctness:nonsense does it cease to be bullshitting? Or is there no tipping point so long as the source is a generative model?

      • Jensson 3 months ago

        It has nothing to do with the ratio and everything to do with intent. Bullshitting is what we say you do when you just spin a story with no care for the truth, just make up stuff that sounds plausible. That is what LLMs do today, and what they will always do as long as we don't train them to care about the truth.

        You can have a generative model that cares about the truth when it tries to generate responses, it's just that current LLMs don't.

        • Ma8ee 3 months ago

          > You can have a generative model that cares about the truth when it tries to generate responses, it's just that current LLMs don't

          How would you do that, when they don't have any concept of truth to start with (or any concepts at all)?

          • Jensson 3 months ago

            You can program a concept of truth into them, or maybe punish them for making mistakes instead of just rewarding them for replicating text. Nobody knows how to do that in a way that gets intelligent results today, but we know how to code things that output or check truths in other contexts; Wolfram Alpha, for example, is capable of solving tons of things and isn't wrong.

            > (or any concepts at all).

            Nobody here said that, that is your interpretation. Not everyone who is skeptical of current LLM architectures' future potential as AGI thinks that computers are unable to solve these things. Most here who argue against LLMs don't think the problems are unsolvable, just not solvable by the current style of LLMs.

            • Ma8ee 3 months ago

              > You can program a concept of truth into them, ...

              The question was, how you do that?

              > Nobody here said that, that is your interpretation.

              What is my interpretation?

              I don't think that the problems are unsolvable, but we don't know how to do it now. Thinking that you can "just program the truth in them" shows a lack of understanding of the magnitude of the problem.

              Personally I'm convinced that we'll never reach any kind of AGI with LLMs. They are lacking any kind of model of the world that can be used to reason with. And the concept of reasoning.

              • Jensson 3 months ago

                > The question was, how you do that?

                And I answered, we don't know how you do that which is why we don't currently.

                > Personally I'm convinced that we'll never reach any kind of AGI with LLMs. They are lacking any kind of model of the world that can be used to reason with. And the concept of reasoning.

                Well, for some definition of LLM we probably could. But probably not the way they are architected today. There is nothing stopping a large language model from adding different things to its training steps to enable new reasoning.

                > What is my interpretation?

                Well, I read your post as being on the other side. I believe it is possible to make a model that can reason about truthiness, but I don't think current style LLMs will lead there. I don't know exactly what will take us there, but I wouldn't rule out an alternate way to train LLMs that looks more like how we teach students in school.

          • mistermann 3 months ago

            Key words like "epistemology" in the prompt. Chat GPT generally outperforms humans in epistemology substantially in my experience, and it seems to "understand" the concept much more clearly and deeply, and without aversion (lack of an ego or sense of self, values, goals, desires, etc?).

        • almostgotcaught 3 months ago

          > It has nothing to do with the ratio and everything to do with intent. Bullshitting is what we say you do when you just spin a story with no care for the truth, just make up stuff that sounds plausible

          Do you people hear yourselves? You're discussing the state of mind of a pseudo-RNG...

          • Jensson 3 months ago

            An ML model's intent is the reward function it has. It strives to maximize rewards, just like a human does. There is nothing strange about this.

            Humans are much more complex than these models, so they have many more concepts and such, which is why we need psychology. But some core aspects work the same in ML and in human thinking. In those cases it is helpful to use the same terminology for humans and machine learning models, because that helps transfer understanding from one domain to the other.

  • yard2010 3 months ago

    I had this perfect mosquito repellent - all you had to do was catch the mosquito and spray the solution into his eyes blinding him immediately.

  • mistercow 3 months ago

    Does every thread about this topic have to have someone quibbling about the word “hallucination”, which is already an established term of art with a well understood meaning? It’s getting exhausting.

    • keiferski 3 months ago

      The term hallucination is a fundamental misunderstanding of how LLMs work, and continuing to use it will ultimately result in a confused picture of what AI and AGI are and what is "actually happening" under the hood.

      Wanting to use accurate language isn't exhausting, it's a requirement if you want to think about and discuss problems clearly.

      • phist_mcgee 3 months ago

        Arguing about semantics rarely keeps topics on track, e.g., my reply to your comment.

        • keiferski 3 months ago

          "Arguing about semantics" implies that there is no real difference between calling something A vs. calling it B.

          I don't think that's the case here: there is a very real difference between describing something with a model that implies one (false) thing vs. a model that doesn't have that flaw.

          If you don't find that convincing, then consider this: by taking the time to properly define things at the beginning, you'll save yourself a ton of time later on down the line – as you don't need to untangle the mess that resulted from being sloppy with definitions at the start.

          This is all a long way of saying that aiming to clarify your thoughts is not the same as arguing pointlessly over definitions.

          • andybak 3 months ago

            "Computer" used to mean the job done by a human being. We chose to use the meaning to refer to machines that did similar tasks. Nobody quibbles about it any more.

            Words can mean more than one thing. And sometimes the new meaning is significantly different but once everyone accepts it, there's no confusion.

            You're arguing that we shouldn't accept the new meaning - not that "it doesn't mean that" (because that's not how language works).

            I think it's fine - we'll get used to it and it's close enough as a metaphor to work.

            • its_ethan 3 months ago

              I'd be willing to bet that people did quibble about what "computer" meant at the time the meaning was transitioning.

              It feels like you're assuming that we're already 60 years past re-defining "hallucination" and the consensus is established, but the fact that people are quibbling about it right now is a sign that the definition is currently in transition/ has not reached consensus.

              What value is there in trying to shut down the consensus-seeking discussion that gave us "computer"? The same logic could have been used to argue that "computers" should actually be called "calculators", so why were people still trying to call them "computers"?

    • DidYaWipe 3 months ago

      Does every completely legitimate condemnation of erroneous language have to be whined about by some apologist for linguistic erosion?

    • baq 3 months ago

      you stole a term which means something else in an established domain and now assert that the ship has sailed, whereas a perfectly valid term in both domains exists. don't be a lazy smartass.

      https://en.wikipedia.org/wiki/Confabulation

      • criddell 3 months ago

        That's actually what the paper is about. I don't know why they didn't use that in the title.

        > Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations—confabulations—which are arbitrary and incorrect generations.

      • nsvd 3 months ago

        This is exactly how language works; words are adopted across domains and change meaning over time.

        • baq 3 months ago

          If there's any forum which can influence a more correct name for a concept it's this one, so please excuse me while I try to point out that contemporary LLMs confabulate and hallucinating should be reserved for more capable models.

    • slashdave 3 months ago

      It is exhausting, but so is the misconception that the output of an LLM can be cleanly divided into two categories.

    • criddell 3 months ago

      If the meaning was established and well understood, this wouldn't happen in every thread.

      • mistercow 3 months ago

        It’s well understood in the field. It’s not well understood by laymen. This is not a problem that people working in the field need to address in their literature.

        • criddell 3 months ago

          We're mostly laymen here.

    • intended 3 months ago

      The paper itself talks about this, so yes?

  • stoniejohnson 3 months ago

    All people do is confabulate too.

    Sometimes it is coherent (grounded in physical and social dynamics) and sometimes it is not.

    We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.

    • Jensson 3 months ago

      > We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.

      The fact that it isn't possible to be right about 100% of things doesn't mean that you shouldn't try to be right.

      Humans generally try to be right; these models don't. That is a massive difference you can't ignore. The fact that humans often fail to be right doesn't mean that these models shouldn't even try to be right.

      • mrtesthah 3 months ago

        By their nature, the models don’t ‘try’ to do anything at all—they’re just weights applied during inference, and the semantic features that are most prevalent in the training set will be most likely to be asserted as truth.

        • Jensson 3 months ago

          They are trained to predict the next word so that it is similar to the text they have seen; I call that what they "try" to do here. A chess AI tries to win since that is what it was encouraged to do during training; current LLMs try to predict the next word since that is what they are trained to do. There is nothing wrong with using that word.

          This is an accurate usage of "try": ML models at their core try to maximize a score, so what that score represents is what they try to do. And there is no concept of truth in LLM training, just sequences of words; they have no score for true or false.

          Edit: Humans are punished as kids for being wrong all throughout school and in most homes, that makes humans try to be right. That is very different from these models that are just rewarded for mimicking regardless if it is right or wrong.

          • idle_zealot 3 months ago

            > That is very different from these models that are just rewarded for mimicking regardless if it is right or wrong

            That's not a totally accurate characterization. The base models are just trained to predict plausible text, but then the models are fine-tuned on instruct or chat training data that encourages a certain "attitude" and correctness. It's far from perfect, but an attempt is certainly made to train them to be right.

            • Jensson 3 months ago

              They are trained to replicate text semantically and then given a lot of correct statements to replicate, that is very different from being trained to be correct. That makes them more useful and less incorrect, but they still don't have a concept of correctness trained into them.

        • shinycode 3 months ago

          Exactly. If massive data poisoning happened, would the AI be able to know what the truth is if there were as much new false information as real information? It wouldn't be able to reason about it.

      • empath75 3 months ago

        > Humans generally try to be right,

        I think this assumption is wrong, and it's making it difficult for people to tackle this problem, because people do not, in general, produce writing with the goal of producing truthful statements. They try to score rhetorical points, they try to _appear smart_, they sometimes intentionally lie because it benefits them for so many reasons, etc. Almost all human writing is full of a range of falsehoods ranging from unintentional misstatements of fact to out-and-out deceptions. Like forget the politically-fraught topic of journalism and just look at the writing produced in the course of doing business -- everything from PR statements down to jira tickets is full of bullshit.

        Any system that is capable of finding "hallucinations" or "confabulations" in ai generated text in general should also be capable of finding them in human produced text, which is probably an insolvable problem.

        I do think that, since the models do have some internal representation of certitude about facts, the smaller problem of finding potentially incorrect statements in their own produced text based on what they know about the world _is_ possible, though.

    • android521 3 months ago

      It is an unsolved problem for humans.

  • shiandow 3 months ago

    If you'd read the article, you might have noticed that generating answers with the LLM is very much part of the fact-checking process.

  • energy123 3 months ago

    The answer is no, otherwise this paper couldn't exist. Just because you can't draw a hard category boundary doesn't mean "hallucination" isn't a coherent concept.

    • tbalsam 3 months ago

      (The OP is referring to one of the foundational concepts relating to the entropy of a model of a distribution of things -- it's not the same terminology that I would use, but the "you have to know everything and the model wouldn't really be useful" objection is why I didn't end up reading the paper after skimming a bit to see if they addressed it.

      It's why things in this arena are a hard problem. It's extremely difficult to actually know the entropy of certain meanings of words, phrases, etc., without a comical amount of computation.

      This is also why a lot of the interpretability methods people use these days have some difficult and effectively permanent challenges inherent to them. Not that they're useless, but I personally feel they are dangerous if used without knowledge of the class of side effects that comes with them.)

  • scotty79 3 months ago

    The idea behind this research is to generate the answer a few times, and if the results are semantically vastly different from each other then they are probably wrong.

  • marcosdumay 3 months ago

    > Isn’t it true that the only thing that LLM’s do is “hallucinate”?

    The Boolean answer to that is "yes".

    But if Boolean logic were a good representation of reality, we would already have solved that AGI thing ages ago. In practice, your neural network is trained with a lot of samples that have some relation between themselves, and to the extent that those relations are predictable, the NN can be perfectly able to predict similar ones.

    There's an entire discipline about testing NNs to see how well they predict things. It's the other side of the coin of training them.

    Then we get to this "know the correct answer" part. If the answer to a question was predictable from the question words, nobody would ask it. So yes, it's a definitive property of NNs that they can't create answers for questions like people have been asking those LLMs.

    However, they do have an internal Q&A database they were trained on. Except that the current architecture can not know if an answer comes from the database either. So, it is possible to force them into giving useful answers, but currently they don't.

  • fnordpiglet 3 months ago

    This isn't true, in the same way that many NP problems are difficult to solve but easy to verify.

  • yieldcrv 3 months ago

    profound but disagree

    the fact checker doesn’t synthesize the facts or the topic

caseyy 3 months ago

Maybe for the moment it would be better if the AI companies simply presented their chatbots as slightly-steered text generation tools. Then people could use them appropriately.

Yes, there seems to be a little bit of grokking and the models can be made to approximate step-by-step reasoning a little bit. But 95% of the function of these black boxes is text generation. Not fact generation, not knowledge generation. They are more like improv partners than encyclopedias and everyone in tech knows it.

I don’t know if LLMs misleading people needs a clever answer entropy solution. And it is a very interesting solution that really seems like it would improve things — effectively putting certainty scores to statements. But what if we just stopped marketing machine learning text generators as near-AGI, which they are not? Wouldn’t that undo most of the damage, and arguably help us much more?

  • signatoremo 3 months ago

    I'm working with an LLM right this moment to build some front end with React and Redux, technologies that I have almost no knowledge of. I posed questions and the LLM gave me the answers along with JavaScript code, a language that I'm also very rusty with. All of the code compiled, and most of it worked as expected. There were errors, some of which I had no idea what they were about. The LLM was able to explain the issues and gave me revised code that worked.

    All in all it’s been a great experience, it’s like working with a mentor along the way. It must have saved me a great deal of time, given how rookie I am. I do need to verify the result.

    Where did you get the 95% figure? And whether what it does is text generation or fact or knowledge generation is irrelevant. It’s really a valuable tool and is way above anything I’ve used.

    • refulgentis 3 months ago

      The last 6 weeks there's been a pronounced uptick in comments, motivated by tiredness of seeing "AI", manifested as a fever dream of them not being useful at all, and swindling the unwashed masses who just haven't used them enough yet to know their true danger.

      I've started calling it what it is: lashing out in confusion at why they're not going away, given a prior that there's no point in using them

      I have a feeling there'll be near-religious holdouts in tech for some time to come. We attract a certain personality type, and they tend to be wedded to the idea of things being absolute and correct in a way things never are.

      • hatefulmoron 3 months ago

        It's also fair to say there's a personality type that becomes fully bought into the newest emerging technologies, insisting that everyone else is either bought into their refusal or "just doesn't get it."

        Look, I'm not against LLMs making me super-human (or at least super-me) in terms of productivity. It just isn't there yet, or maybe it won't be. Maybe whatever approach after current LLMs will be.

        I think it's just a little funny that you started by accusing people of dismissing others as "unwashed masses", only to conclude that the people who disagree with you are being unreasonable, near-religious, and simply lashing out.

        • refulgentis 3 months ago

          I don't describe disagreeing with anyone, nor do I describe the people making these comments as near-religious, or simply lashing out, nor do I describe anyone as unreasonable

          I reject simplistic binaries and They-ing altogether, it's incredibly boring and a waste of everyone's time.

          An old-fashioned breakdown for your troubles:

          > It's also fair to say

          Did anyone say it isn't fair?

          > there's a personality type that becomes fully bought into the newest emerging technologies

          Who are you referring to? Why is this group relevant?

          > insisting that everyone else is either bought into their refusal or "just doesn't get it."

          Who?

          What does insisting mean to you?

          What does "bought into refusal" mean? I tried googling, but there's 0 results for both 'bought into refusal' and 'bought into their refusal'

          Who are you quoting when you introduce this "just doesn't get it" quote?

          > Look, I'm not against LLMs making me super-human (or at least super-me) in terms of productivity.

          Who is invoking super humans? Who said you were against it?

          > It just isn't there yet, or maybe it won't be.

          Given the language you use below, I'm just extremely curious how you'd describe me telling the person I was replying to that their lived experience was incorrect. Would that be accusing them of exaggerating? Dismissing them? Almost like calling them part of an unwashed mass?

          > Maybe whatever approach after current LLMs will be.

          You're blithely doing a stream of consciousness deconstructing a strawman and now you get to the interesting part? And just left it here? Darn! I was really excited to hear some specifics on this.

          > I think it's just a little funny that you started by accusing people of dismissing others as "unwashed masses",

          That's quite charged language from the reasonable referee! Accusing, dismissing, funny...my.

          > only to conclude that the people who disagree with you are being unreasonable, near-religious, and simply lashing out.

          Source? Are you sure I didn't separate the paragraphs on purpose? Paragraph breaks are commonly used to separate ideas and topics. Is it possible I intended to do that? I could claim I did, but it seems you expect me to wait for your explanation for what I'm thinking.

          • elicksaur 3 months ago

            > I reject simplistic binaries and They-ing altogether, it's incredibly boring and waste of everyone’s time.

            > I have a feeling there'll be near-religious holdouts

            Pick one! Something tells me that everyone you disagree with is blinded by “religious”-ness or some other label you ascribe irrationality to.

          • hatefulmoron 3 months ago

            >> It's also fair to say

            > Did anyone say it isn't fair?

            No. I don't think I said you did, either. One might call this a turn of phrase.

            >> there's a personality type that becomes fully bought into the newest emerging technologies

            > Who? Why is this group relevant?

            What do you mean 'who'? Do you want names? It's relevant because it's the opposite, but equally incorrect, mirror image of the technology denier that you describe.

            >> Look, I'm not against LLMs making me super-human (or at least super-me) in terms of productivity.

            > Who is invoking super humans? Who said you were against it?

            ... I am? And I didn't say you thought I was against it? I feel like this might be a common issue for you (see paragraph 1.) I'm just saying that I'd like to be able to use LLMs to make myself more productive! Forgive me!

            >> It just isn't there yet, or maybe it won't be.

            > Strawman

            Of what?? I'm simply expressing my own opinion of something, detached from what you think. It's not there yet. That's it.

            >> Maybe whatever approach after current LLMs will be.

            > Darn! I was really excited to hear some specifics on this.

            I don't know what will be after LLMs, I don't recall expressing some belief that I did.

            > Thats quite charged language from the reasonable referee! Accusing, dismissing, funny...my.

            I could use the word 'describing' if you think the word 'accusing' is too painful for your ears. Let me know.

            > Source? Are you sure I didn't separate the paragraphs on purpose? Paragraph breaks are commonly used to separate ideas and topics. Is it possible I intended to do that? I could claim I did, but it seems you expect me to wait for your explanation for what I'm thinking.

            Could you rephrase this in a different way? The rambling questions are obscuring your point.

Animats 3 months ago

"We show how to detect confabulations by developing a quantitative measure of when an input is likely to cause an LLM to generate arbitrary and ungrounded answers. ... Intuitively, our method works by sampling several possible answers to each question and clustering them algorithmically into answers that have similar meanings."

That's reasonable for questions with a single objective answer. It probably won't help when multiple, equally valid answers are possible.

However, that's good enough for search engine applications.
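
To make the quoted procedure concrete, here is a minimal sketch in Python (my reading of the abstract, not the authors' code). The answers are assumed to be already sampled, and a trivial string normalization stands in for the semantic-equivalence check; as I understand it, the paper clusters answers by bidirectional entailment, which is what lets different phrasings of the same answer land in one cluster. The entropy of the resulting cluster distribution is then the uncertainty signal.

  import re

  def normalize(ans: str) -> str:
      # Toy stand-in for the semantic-equivalence check. The real method asks
      # an entailment model whether two answers entail each other in both
      # directions; here I just normalize the surface string.
      return " ".join(re.findall(r"[a-z0-9]+", ans.lower()))

  def cluster_by_meaning(answers):
      # Group sampled answers into "meaning" clusters.
      clusters = {}
      for ans in answers:
          clusters.setdefault(normalize(ans), []).append(ans)
      return list(clusters.values())

  # Hypothetical samples for a question with one objective answer:
  samples = ["Paris", "Paris.", "paris", "Lyon", "Paris"]
  print([len(c) for c in cluster_by_meaning(samples)])  # -> [4, 1]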

rbanffy 3 months ago

The concept of semantic entropy reminds me of a bank, whose name I can't remember, that, in the aftermath of the Enron catastrophe, made a "bullshitometer" to measure the level of bullshit in press releases. They applied it to the Enron press releases before the company's implosion and showed it could have predicted the collapse.

foota 3 months ago

There's a concept in statistics called sensitivity analysis. It seems like this is somewhat similar, but an alternative approach that might be interesting would be to modify the input in a way that you think should preserve the semantic meaning, and see how that alters the meaning of the output.

Of course, altering the input without changing the meaning is the hard part, but doesn't seem entirely infeasible. At the least, you could just ask the LLM to try to alter the input without changing the meaning, although you might end up in a situation where it alters the input in a way that aligns with its own faulty understanding of an input, meaning it could match the hallucinated output better after modification.
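
A rough sketch of what I mean, with the hard parts stubbed out: paraphrase() and ask_llm() below are hypothetical placeholders (in practice both would be model calls, which is exactly where the faulty-understanding caveat above bites), and agreement is measured naively as the frequency of the most common answer.

  from collections import Counter

  def paraphrase(question: str) -> list[str]:
      # Hypothetical placeholder: in practice you would ask a model (or a person)
      # for rewordings that should preserve the question's meaning.
      return [question,
              question.replace("Who wrote", "Which author wrote"),
              "Tell me " + question[0].lower() + question[1:].rstrip("?") + "."]

  def ask_llm(prompt: str) -> str:
      # Hypothetical placeholder for the model under test.
      canned = {"Who wrote Dune?": "Frank Herbert"}
      return canned.get(prompt, "Frank Herbert")  # demo answer

  def consistency_under_rewording(question: str) -> float:
      # Ask the same question in several guises and measure how often the most
      # common answer appears; low agreement suggests the answer tracks surface
      # form rather than content.
      answers = [ask_llm(q) for q in paraphrase(question)]
      return Counter(answers).most_common(1)[0][1] / len(answers)

  print(consistency_under_rewording("Who wrote Dune?"))  # 1.0 in this toy demo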

jostmey 3 months ago

So, I can understand how their semantic entropy (which seems to require an LLM trained to detect semantic equivalence) might be better at catching hallucinations. However, I don't see how semantic equivalence directly tackles the problem of hallucinations. Currently, I naively suspect it is just a heuristic for catching hallucinations. Furthermore, the requirement of a second LLM trained to detect semantic equivalence to catch these events seems like an unnecessary complication. If I had a dataset of semantic equivalence to train a second LLM, I would directly incorporate it into the training process of my primary LLM.

  • bravura 3 months ago

    I haven’t really grokked this work yet well enough to critique it, but to answer your question:

    Yes you could incorporate a semantic equivalence dataset into your training but:

    1) when you have a bunch of ‘clear-cut’ functions (“achieve good AUC on semantics”) and you mix them to compensate for the weaknesses of a complicated model with an unknown perceptual objective, things are still kinda weird. You don’t know if you’re mixing them well, to start, and you also don’t know if they introduce unpredictable consequences or hazards or biases in the learning.

    2) on a kinda narrowly defined task like: “can you determine semantic equivalence”, you can build a good model with less risk of unknown unknowns (than when there are myriad unpredictable interactions with other goal scoring measures)

    3) if you can apply that model in a relatively clear-cut way, you also have fewer unknown unknowns.

    Thus, carving a path to a particular reasonable heuristic using two slightly biased estimators can be MUCH safer and more general than mixing that data into a preexisting unholy brew and expecting its contribution to be predictable.

  • jampekka 3 months ago

    Catching "hallucinations" is quite useful for many applications. I'm doing some research in mitigating effects of factual errors in LLM generated answers for public agencies, where giving a factually wrong answer may be illegal. If those could be detected (with sufficient accuracy), the system could simply decline to give an answer and ask the user to contact the agency.

    Training the models not to give wrong answers (or giving them less) in the first place would of course be even better.

    Unnecessary complications also come from the use of pre-trained commercial black-box LLMs through APIs, which is (sadly) the way LLMs are used in applications in the vast majority of cases. These could perhaps be fine-tuned through the APIs too, but that tends to be rather fiddly and limited, and very expensive to do for large synthetic datasets like the ones that would be used here.

    P.S. I found it quite difficult to figure out from the article how the "semantic entropy" (actually multiple different entropies) is concretely computed. If somebody is interested in this, it's a lot easier to figure out from the code: https://github.com/jlko/semantic_uncertainty/blob/master/sem...
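
    For what it's worth, my rough reading of the repo boils down to the sketch below (mine, not the authors' code): the "discrete" variant only needs the cluster assignments of the sampled answers, while the full version aggregates length-normalised sequence log-probabilities per cluster before taking the entropy.

      import math
      from collections import Counter

      def discrete_semantic_entropy(cluster_ids):
          # cluster_ids: one semantic-cluster label per sampled answer,
          # e.g. [0, 0, 0, 1, 2] for 5 samples that fell into 3 meanings.
          counts = Counter(cluster_ids)
          n = len(cluster_ids)
          return sum(-(c / n) * math.log(c / n) for c in counts.values())

      print(discrete_semantic_entropy([0, 0, 0, 0, 0]))  # 0.0: all samples agree
      print(discrete_semantic_entropy([0, 0, 0, 1, 2]))  # ~0.95: answers spread out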

curious_cat_163 3 months ago

It’s a pretty clever idea: “check” if the model answers “differently” when asked the same question again and again and again.

“checking” is being done with another model.

“differently” is being measured with entropy.

caseyy 3 months ago

This makes sense. Low semantic entropy probably means the answer was more represented in the unsupervised learning training data, or in later tuning. And I understand this is a tool to indirectly measure how much it was represented?

It’s an interesting idea to measure certainty this way. The problem remains that the model can be certain in this way and wrong. But the author did say this was a partial solution.

Still, wouldn’t we be able to already produce a confidence score at the model level like this? Instead of a “post-processor”?
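
(For context on what a "model level" score could even look like: the naive version is just the token log-probabilities the model already emits, averaged over the answer, as in the sketch below using a small local model purely for illustration. As I understand the paper, its point is that this kind of token-level confidence mixes up "many ways to phrase one answer" with "many genuinely different answers", which is what the semantic clustering step is meant to untangle.)

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Small model purely for illustration; any causal LM works the same way.
  tok = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")
  model.eval()

  def mean_token_logprob(text: str) -> float:
      # Average log-probability the model assigns to each token of the text
      # given the preceding tokens -- a crude "model level" confidence score.
      ids = tok(text, return_tensors="pt")["input_ids"]
      with torch.no_grad():
          logits = model(ids).logits
      logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
      picked = logprobs.gather(2, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
      return picked.mean().item()

  print(mean_token_logprob("The capital of France is Paris."))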

trafalgar_law 3 months ago

Anyone noticed this: They have already published the same basic idea "Semantic Entropy" in ICLR 2023 (https://openreview.net/forum?id=VD-AYtP0dve), but they did not cite this prior work in their Nature paper. From the submission record, they submitted this Nature paper after their ICLR paper got accepted and published. According to the Nature submission policy regarding conference papers (https://www.nature.com/nature/editorial-policies/preprints-c...), it is clearly stated that "Authors must provide details of the conference proceedings paper with their submission including relevant citation in the submitted manuscript." So this seems a clear-cut violation of Nature policy to me. Any thought?

iandanforth 3 months ago

The semantic equivalence of possible outputs is already encoded in the model. While it is not necessarily recoverable from the logits of a particular sampling rollout it exists throughout prior layers.

So this is basically saying we shouldn't try to estimate entropy over logits, but should be able to learn a function from activations earlier in the network to a degree of uncertainty that would signal (aka be classifiable as) confabulation.
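
In other words, something like the sketch below (illustrative only, not a claim about any existing system): capture hidden-layer activations for prompts whose answers you can label as grounded vs. confabulated, then fit a simple probe on top. Random vectors stand in for the real activations here.

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)

  # Stand-ins for real data: acts would be activations captured from an
  # earlier layer while the model answered known questions, and labels
  # would mark which of those answers were confabulated.
  n, d = 200, 64
  acts = rng.normal(size=(n, d))
  labels = (acts[:, :4].sum(axis=1) > 0).astype(int)  # fake signal for the demo

  probe = LogisticRegression(max_iter=1000).fit(acts[:150], labels[:150])
  print("held-out accuracy:", probe.score(acts[150:], labels[150:]))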

Havoc 3 months ago

Won't this catch creativity too? E.g., "write me a story about a horse". LLMs freestyle that sort of thing quite hard, so won't that look the same under the hood?

  • Def_Os 3 months ago

    This is a good point. If you're worried about factuality, entropy is generally bad. But creative uses might thrive on it.

    • tliltocatl 3 months ago

      But if you care about factuality, why would you use a generative model at all, rather than RAG or some old-school fuzzy full-text search? The whole thing sounds like "we have a technique (LLM) that gives results once considered impossible, so it must be the magic box that solves each and every problem".

avivallssa 3 months ago

While I cannot argue about a specific approach, I can say that hallucinations can only be minimized, and only through several layers of measures, when working with large language models.

    As an example, while we were building our AI chatbot for Ora2Pg, the main challenge was that we used OpenAI and several other models to begin with. To avoid hallucinations to the greatest possible extent, we went through various levels, including PDR, then Knowledge Graphs, added FAQs, and then used an agentic approach to support it with as much information as possible from all available contexts.

    As it is very challenging for most teams to build their own models trained on their own data, it is not really possible to avoid hallucinations with general-purpose LLMs unless they are trained on our data sets.

    Here is the chatbot that we built to avoid hallucinations as much as we can:

https://ora2pgsupport.hexacluster.ai/

k__ 3 months ago

How big of a problem are hallucinations right now?

I use LLMs daily and get crappy results more often than not, but I had the impression that would be normal, as the training data can be contradictory.

  • danielbln 3 months ago

    I feel that since a lot of platforms integrate tool use (e.g. search), it's become easier to root out hallucinations by just asking the model to search and validate its own output.
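
    A sketch of that pattern, with llm and search as stand-ins for whatever model and search tool the platform exposes:

        def validated_answer(question, llm, search):
            """Draft an answer, then ask the model to check it against search results."""
            draft = llm(question)
            evidence = "\n".join(search(question)[:5])
            verdict = llm(
                "Answer strictly SUPPORTED or UNSUPPORTED.\n"
                f"Claim: {draft}\nEvidence:\n{evidence}"
            )
            return draft if verdict.strip().upper().startswith("SUPPORTED") else None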

imchillyb 3 months ago

Lies lie at the center of common discourse.

The trick isn't in how to spot the lies, but how to properly apply them. We cannot teach the AI how not to lie, without first teaching it when it must lie, and then how to apply the lie properly.

"AI, tell me, do these jeans make me look fat?"

AI: NO. You are fat. The jeans are fine.

That is not acceptable discourse. Learning when and how to apply semantic truth-stretching is imperative.

They must first understand where and when, then how, and finally why.

It's how we teach our young. Isn't it?

  • alliao 3 months ago

    but this thing doesn't die, and why should it imitate our young?

    • mistermann 3 months ago

      They are both trained on the same virtual reality.

farceSpherule 3 months ago

Hallucination is a combination of two conscious brain states: wakefulness and REM sleep.

Computers cannot "hallucinate."

3abiton 3 months ago

I skimmed through the paper, but don't LLMs guess most of the time? Sometimes these guesses contain noise that may or may not be on point. I wonder if "confabulation" has a more formal definition.

  • sn41 3 months ago

    There is an article on confabulations, which appears to be a concept from neuroscience. From the abstract of the article:

    "Confabulations are inaccurate or false narratives purporting to convey information about world or self. It is the received view that they are uttered by subjects intent on ‘covering up’ for a putative memory deficit."

    It seems that there is a clear memory deficit about the incident, so the subject "makes stuff up", knowingly or unknowingly.

    --

    cited from:

    German E. Berrios, "Confabulations: A Conceptual History", Journal of the History of the Neurosciences, Volume 7, 1998 - Issue 3

    https://www.tandfonline.com/doi/abs/10.1076/jhin.7.3.225.185...

    DOI: 10.1076/jhin.7.3.225.1855

klysm 3 months ago

The intersection into epistemology is very interesting

  • caseyy 3 months ago

    Yes… is knowledge with lower entropy in society more true? That sounds to me like saying that ideas held by big echo chambers like Reddit or X are more true. They kind of have a similar low entropy = higher visibility principle. But I don’t think many commonly agreed-upon ideas on social media are necessarily true.

more_corn 3 months ago

This is huge, though not a hundred percent there.