bbor 5 days ago

Well put! If you/y’all haven’t heard, there’s a popular breakdown of “technical documentation” into four types, and this is one of the axes: https://nick.groenen.me/posts/the-4-types-of-technical-docum...

People’s names for the four types vary, but I’m personally a fan of naming the axes “propositional vs procedural” and “informational vs developmental”, giving us a final four categories (00, 01, 10, 11) of “References”, “Instructions”, “Lessons”, and “Tutorials”. I think the applicability to LLM clearly holds up! Though more so for advanced chatbots than HR widgets TBF, I doubt anyone is looking for developmental content from one of those.

  • shabie 5 days ago

    Thanks a lot for sharing, I have not heard of this before.

niobe 5 days ago

Well put, and separating these would be a good use case for system prompts e.g.

llm -m model --save instructional --system "provide the detailed steps to achieve the outcome, using a suitable example if necessary"

llm -m model --save informational --system "provide a concise conceptual overview but do not provide implementation steps or detailed examples"

  • shabie 5 days ago

    That's actually a pretty interesting point. Not just evals but other components like system prompt should also be tailored to match the expected outcome.

trash_cat 7 hours ago

But then how do you classify a task that the LMM performed, such as a summary? I think you are onto something here but it really depends on what task you want the LMM to perform, search, how to, summary, extraction etc...

rwnspace 10 hours ago

My experience with them doesn't quite fit either: I've primarily used LLMs for giving me hints when I'm struggling with a leetcode problem or similar. They're surprisingly good at it, providing you regularly remind them to provide little clues only.

js8 a day ago

I wish there also was a distinction between truthful and whimsical.

fuzzy_biscuit a day ago

When I was doing SEO full-time, this is one of the ways we used to categorize content - via intent. As a result, my immediate question becomes: how long before those responses start to be subsumed by commercial intent responses? To me, this is an inevitability. A when, not an if.