Comment by simonw - Hacker Neue

simonw Oct 5, 2025

The best-performing format isn't markdown tables, it's markdown key/value pairs:

  ## Record 1
  
  ```
  id: 1
  name: Charlie A0
  age: 56
  city: New York
  department: Operations
  salary: 67896
  years_experience: 7
  project_count: 1
  ```

Which makes sense to me because the problem with formats like CSV and regular markdown tables is that it is too easy for the model to mistakenly associate a value in a row with the wrong header.

Explicit key/value formats like this or YAML or JSON objects make that a lot less likely.
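
As a rough illustration of the format above, here is a minimal Python sketch that renders a list of dicts as markdown key/value blocks. The function name `to_markdown_kv` and the `FENCE` constant are hypothetical names chosen for this example; the thread does not say how the benchmark generated its inputs.

```python
# Build the code-fence marker programmatically so this example can itself
# sit inside a fenced block.
FENCE = "`" * 3


def to_markdown_kv(records):
    """Render a list of dicts as markdown key/value blocks, one per record."""
    blocks = []
    for index, record in enumerate(records, start=1):
        lines = [f"## Record {index}", "", FENCE]
        # Each value sits on the same line as its key, so it cannot be
        # paired with the wrong header the way a table cell can.
        lines.extend(f"{key}: {value}" for key, value in record.items())
        lines.append(FENCE)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    sample = [
        {
            "id": 1,
            "name": "Charlie A0",
            "age": 56,
            "city": "New York",
            "department": "Operations",
            "salary": 67896,
            "years_experience": 7,
            "project_count": 1,
        }
    ]
    print(to_markdown_kv(sample))
```

Because every key is repeated for every record, the output is longer than a CSV of the same data. That is the trade-off for making each key/value association explicit.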

cwmoore Oct 5, 2025

I was surprised that XML (56%), with closing tags, wasn’t as good as YAML/KV (60%), though line breaks perform the same kind of grouping function.

Then I realized from the table that XML used about 50% more tokens (~75K vs ~50K) for similar accuracy, and for the first time felt a kind of sympathy for the LLM…
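
The arithmetic behind that comparison can be checked with a few lines of Python. This sketch uses only the approximate figures quoted in this comment; the variable names are made up for the example, and "accuracy per 1K tokens" is just one illustrative way to combine the two numbers, not a metric from the benchmark.

```python
# Approximate figures as quoted in the comment above.
xml_tokens, kv_tokens = 75_000, 50_000
xml_accuracy, kv_accuracy = 56.0, 60.0  # percent

# Fraction of extra tokens XML uses relative to the key/value format.
token_overhead = xml_tokens / kv_tokens - 1

# Accuracy points obtained per 1,000 input tokens (illustrative only).
xml_per_1k = xml_accuracy / (xml_tokens / 1000)
kv_per_1k = kv_accuracy / (kv_tokens / 1000)

print(f"XML token overhead: {token_overhead:.0%}")
print(f"Accuracy points per 1K tokens: XML {xml_per_1k:.2f}, KV {kv_per_1k:.2f}")
```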

svachalek Oct 6, 2025

Yeah, that was my intuition as well. I think the KV-Markdown format gains an additional advantage over JSON and YAML because its special header syntax helps break up records.
