I wonder if there's a way to do diffusion within some sort of schema-defined or type-constrained space.
A lot of people these days want structured output from LLMs that follows a schema. Even if you train a transformer on schema-following, you're still just 'hoping' in the end that the generated JSON matches the schema.
I'm not a diffusion expert, but maybe there's a way to diffuse one value in the 'space' of numbers and another value in the 'space' of all strings, as required by a schema:
{ "type": "object", "properties": { "amount": { "type": "number" }, "description": { "type": "string" } }, "required": ["amount", "description"] }
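To make the idea concrete, here's a toy sketch of what I mean, not real diffusion. Each field starts as noise that is already inside its own space (a float for the number, a list of characters for the string), and every "denoising" step stays inside that space, so the object matches the schema at every step by construction. Everything here is made up for illustration: `init_noise`, `denoise_step`, `decode`, `matches` and `sample` are hypothetical names, and `targets` stands in for whatever a trained model would predict.

```python
import json
import random
import string

SCHEMA = {
    "type": "object",
    "properties": {
        "amount": {"type": "number"},
        "description": {"type": "string"},
    },
    "required": ["amount", "description"],
}

ALPHABET = string.ascii_lowercase + " "


def init_noise(field_type, rng):
    """Draw pure noise, but already inside the field's own space."""
    if field_type == "number":
        return rng.gauss(0.0, 1.0)
    if field_type == "string":
        return [rng.choice(ALPHABET) for _ in range(12)]
    raise ValueError(f"unsupported type: {field_type}")


def denoise_step(value, field_type, target, t, rng):
    """One toy reverse step; `target` is a stand-in for a learned model's prediction."""
    if field_type == "number":
        # Continuous space: move a fraction 1/t of the way toward the prediction.
        return value + (target - value) / t
    # Discrete space: fix each wrong character with probability 1/t.
    padded = target.ljust(len(value))[: len(value)]
    return [p if (c != p and rng.random() < 1.0 / t) else c
            for c, p in zip(value, padded)]


def decode(state):
    """Turn the internal state into a plain JSON-able object."""
    return {k: "".join(v).rstrip() if isinstance(v, list) else v
            for k, v in state.items()}


def matches(schema, obj):
    """Hand-rolled check for this flat schema only, not a full JSON Schema validator."""
    checks = {"number": (int, float), "string": str}
    return all(key in obj and isinstance(obj[key], checks[spec["type"]])
               for key, spec in schema["properties"].items())


def sample(schema, targets, steps=8, seed=0):
    rng = random.Random(seed)
    types = {k: spec["type"] for k, spec in schema["properties"].items()}
    state = {k: init_noise(ft, rng) for k, ft in types.items()}
    for t in range(steps, 0, -1):
        state = {k: denoise_step(state[k], types[k], targets[k], t, rng)
                 for k in state}
        # Valid at every intermediate step, not just at the end.
        assert matches(schema, decode(state))
    return decode(state)


result = sample(SCHEMA, {"amount": 12.5, "description": "coffee beans"})
print(json.dumps(result))
```

The point is only that validity comes from the shape of the state space rather than from training. Whether a real diffusion model can be made to work over a mixed continuous/discrete space like this is exactly the part I don't know.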
I'm not sure how far this could lead. Could you diffuse more complex schemas that generalize to an arbitrary syntax tree? E.g. diffuse some code in a programming language that is guaranteed to be type-safe?