the idea is that instead of using JSON.parse, we create a custom Type.parse for each type you define.
so if you want a:
class Job { company: string[] }
And the LLM happens to output: { "company": "Amazon" }
We can upcast "Amazon" -> ["Amazon"] since you indicated that in your schema.https://www.boundaryml.com/blog/schema-aligned-parsing
and since its only post processing, the technique will work on every model :)
for example, on BFCL benchmarks, we got SAP + GPT3.5 to beat out GPT4o ( https://www.boundaryml.com/blog/sota-function-calling )
so if you want a:
class Job { company: string[] }
We can upcast "Amazon" -> ["Amazon"] since you indicated that in your schema.
Congratulations! You've discovered Applicative Lifting.but applicative lifting is a big part of it as well!
gloochat.notion.site/benefits-of-baml
Client: Ollama (phi4) - 90164ms. StopReason: stop. Tokens(in/out): 365/396
---PROMPT---
user: Extract from this content:
Grave Digger:
Ingredients
- 1 1/2 ounces vanilla-infused brandy*
- 3/4 ounce coffee liqueur
- 1/2 ounce Grand Marnier
- 1 ounce espresso, freshly brewed
- Garnish: whipped cream
- Garnish: oreo cookies, crushed
Steps
1. Add all ingredients into a shaker with ice and shake until
well-chilled.
2. Strain into a coupe.
3. Top with whipped cream and crushed Oreo cookies (discarding cream in
center).
*Vanilla-infused brandy: Cut 2 fresh vanilla pods lengthwise and place
into a 750 mL bottle of brandy. Let sit for 3 to 5 days, shaking
occasionally. Strain out pods and store.
IngredientType
----
- Alcohol
- Sweetener
- Sour
- Aromatic
- Bittering_agent
- Food
- Dilution
LiquorType
----
- Gin
- Vodka
- Rum
- Whiskey
- Tequila
- Mezcal
- Brandy
- Cognac
- Liqueur
- Wine
- Absinthe
Answer in JSON using this schema:
{
Name: string,
Ingredients: [
{
Text: string,
Type: IngredientType,
Liquor_type: LiquorType or null,
Name_brand: string or null,
Unit_of_measure: string,
Measurement_or_unit_count: string,
}
],
Steps: [
{
Number: int,
Instruction: string,
}
],
Garnishes: string[],
}
---LLM REPLY---
```json
{
"Name": "Grave Digger",
"Ingredients": [
{
"Text": "vanilla-infused brandy*",
"Type": "Alcohol",
"Liquor_type": "Brandy",
"Name_brand": null,
"Unit_of_measure": "ounces",
"Measurement_or_unit_count": "1 1/2"
},
{
"Text": "coffee liqueur",
"Type": "Liqueur",
"Liquor_type": "Liqueur",
"Name_brand": null,
"Unit_of_measure": "ounce",
"Measurement_or_unit_count": "3/4"
},
{
"Text": "Grand Marnier",
"Type": "Liqueur",
"Liquor_type": "Liqueur",
"Name_brand": "Grand Marnier",
"Unit_of_measure": "ounce",
"Measurement_or_unit_count": "1/2"
},
{
"Text": "espresso, freshly brewed",
"Type": "Bittering_agent",
"Liquor_type": null,
"Name_brand": null,
"Unit_of_measure": "ounces",
"Measurement_or_unit_count": "1"
}
],
"Steps": [
{
"Number": 1,
"Instruction": "Add all ingredients into a shaker with ice and shake until well-chilled."
},
{
"Number": 2,
"Instruction": "Strain into a coupe."
},
{
"Number": 3,
"Instruction": "Top with whipped cream and crushed Oreo cookies (discarding cream in center)."
}
],
"Garnishes": [
"whipped cream",
"oreo cookies, crushed"
]
}
```
---Parsed Response (class Recipe)---
{
"Name": "Grave Digger",
"Ingredients": [
{
"Text": "vanilla-infused brandy*",
"Type": "Alcohol",
"Liquor_type": "Brandy",
"Name_brand": null,
"Unit_of_measure": "ounces",
"Measurement_or_unit_count": "1 1/2"
},
{
"Text": "espresso, freshly brewed",
"Type": "Bittering_agent",
"Liquor_type": null,
"Name_brand": null,
"Unit_of_measure": "ounces",
"Measurement_or_unit_count": "1"
}
],
"Steps": [
{
"Number": 1,
"Instruction": "Add all ingredients into a shaker with ice and shake until well-chilled."
},
{
"Number": 2,
"Instruction": "Strain into a coupe."
},
{
"Number": 3,
"Instruction": "Top with whipped cream and crushed Oreo cookies (discarding cream in center)."
}
],
"Garnishes": [
"whipped cream",
"oreo cookies, crushed"
]
}
Processed Recipe: {
Name: 'Grave Digger',
Ingredients: [
{
Text: 'vanilla-infused brandy*',
Type: 'Alcohol',
Liquor_type: 'Brandy',
Name_brand: null,
Unit_of_measure: 'ounces',
Measurement_or_unit_count: '1 1/2'
},
{
Text: 'espresso, freshly brewed',
Type: 'Bittering_agent',
Liquor_type: null,
Name_brand: null,
Unit_of_measure: 'ounces',
Measurement_or_unit_count: '1'
}
],
Steps: [
{
Number: 1,
Instruction: 'Add all ingredients into a shaker with ice and shake until well-chilled.'
},
{ Number: 2, Instruction: 'Strain into a coupe.' },
{
Number: 3,
Instruction: 'Top with whipped cream and crushed Oreo cookies (discarding cream in center).'
}
],
Garnishes: [ 'whipped cream', 'oreo cookies, crushed' ]
}So, yeah, the main issue being that it dropped some ingredients that were present in the original LLM reply. Separately, the original LLM Reply misclassified the `Type` field in `coffee liqueur`, which should have been `Alcohol`.
the model replied with
{
"Text": "coffee liqueur",
"Type": "Liqueur",
"Liquor_type": "Liqueur",
"Name_brand": null,
"Unit_of_measure": "ounce",
"Measurement_or_unit_count": "3/4"
},
but you expected a
{
Text: string,
Type: IngredientType,
Liquor_type: LiquorType or null,
Name_brand: string or null,
Unit_of_measure: string,
Measurement_or_unit_count: string,
}there's no way to cast `Liqueur` -> `IngredientType`. but since the the data model is a `Ingredient[]` we attempted to give you as many ingredients as possible.
The model itself being wrong isn't something we can do much about. that depends on 2 things (the capabilities of the model, and the prompt you pass in).
If you wanted to capture all of the items with more rigor you could write it in this way:
class Recipe {
name string
ingredients Ingredient[]
num_ingredients int
...
// add a constraint on the type
@@assert(counts_match, {{ this.ingredients|length == this.num_ingredients }})
}
And then if you want to be very wild, put this in your prompt: {{ ctx.output_format }}
No quotes around strings
And it'll do some cool stuff
I was trying to make a decent cocktail recipe database, and scraped the text of cocktails from about 1400 webpages. Note that this was just the text of the cocktail recipe, and cocktail recipes are comparatively small. I sent the text to an LLM for JSON structuring, and the LLM routinely miscategorized liquor types. It also failed to normalize measurements with explicit instructions and the temperature set to zero. I gave up.