Hacker News | hitchdev's comments

>The issue here is that schema validation is expressed in Python. The author contradicts himself when he argues, on the one hand, that Python shouldn't be used for configuration because it's too powerful: https://hitchdev.com/strictyaml/why-not/turing-complete-code... and on the other hand, that Python is really powerful for building schemas: https://hitchdev.com/strictyaml/why-not/json-schema/ .

Disclosure: I'm the author.

The difference is that full validation/parsing is a task that can rarely be fully accomplished with JUST a non-Turing-complete schema. Every time I use JSON Schema I have to add additional validation on top, written in Turing-complete code.

This happened to me literally just an hour ago when I wanted to put a DSL in a field in a config file. json-schema (the "config" schema) doesn't let me write code to validate this and reject it. It's a string or it's not. With StrictYAML schemas written in code it's pretty straightforward to create a parser/validator that rejects invalid DSL with a meaningful error.
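Something like this sketch, in plain Python (the "repeat schedule" mini-DSL and all names here are made up for illustration; this is the schema-as-code idea, not StrictYAML's actual API):

```python
import re

# Hypothetical mini-DSL for a config field: "every <N> <minutes|hours|days>".
DSL_PATTERN = re.compile(r"^every (\d+) (minutes|hours|days)$")

def validate_schedule(value, key):
    """Schema rule written in code: parse the DSL or reject it
    with a meaningful error, rather than accepting any string."""
    match = DSL_PATTERN.match(value)
    if match is None:
        raise ValueError(
            f"'{key}': expected a schedule like 'every 5 minutes', got {value!r}"
        )
    return int(match.group(1)), match.group(2)
```

To JSON Schema that field is just a string; a validator written in code can parse it and fail with an error that tells the user exactly what was wrong.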

You might argue that "these rules bolted on top aren't part of the schema" or "this is validation that you can do after the json schema validates" but there is benefit to combining them - namely, code coherence and validation error consistency.

(There are also downsides - namely that a JSON Schema can be reused across multiple languages, whereas a schema written in code can't. Strictness comes at the expense of reusability.)

In practice, for almost every schema I build I want stricter validation rules than something like JSON Schema alone can enforce.

These are both instances of the law of least power. There are plenty of languages which are too powerful for the task at hand and plenty which are not powerful enough and people hack around and even rage against both. There are other "goldilocks" languages that are just right for the task at hand.


> This happened to me literally just an hour ago when I wanted to put a DSL in a field in a config file. json-schema (the "config" schema) doesn't let me write code to validate this and reject it.

You can embed DSLs in CUE. It's a bit unwieldy because you have to essentially reproduce the DSL grammar in CUE, and it may not be performant, but yeah, it's doable. Can you provide more details?

> You might argue that "these rules bolted on top aren't part of the schema" or "this is validation that you can do after the json schema validates" but there is benefit to combining them - namely, code coherence and validation error consistency.

I would argue that it's a slippery slope. Consider v1 where an enum is statically defined as Employee or Manager. Then in v2 we add VP and CEO. Then in v3 actually the list of permitted titles needs to be fetched from a database populated by HR. Is it still correct to put this in configuration validation? What if the person writing the configuration doesn't have permissions to read from HR's database? So nothing should work?


>You can embed DSLs in CUE

CUE lets you embed functions too - it looks like it's almost a programming language itself.

The closer a configuration language gets to a programming language the less of a reason I see for it to exist.

>I would argue that it's a slippery slope. Consider v1 where an enum is statically defined as Employee or Manager. Then in v2 we add VP and CEO. Then in v3 actually the list of permitted titles needs to be fetched from a database populated by HR. Is it still correct to put this in configuration validation?

No, coupling to a database would be bad design IMO, but I have often grabbed those enums from other config files in the same folder that are parsed earlier.

I've also used libraries that provide lists of timezones and country codes as enums and plugged them into the parser so you couldn't invent your own country code.

And I've written validators that reference other bits of the config (e.g. the list of permitted titles is in another part of the config).

All of these things I would argue are good and useful and not worth sacrificing in exchange for preventing possible misuse (like coupling parsers to a DB).
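A sketch of that cross-referencing kind of validator (the config shape and names here are hypothetical, purely for illustration):

```python
# One section of the config defines the enum; another section is
# validated against it - no database coupling required.
config = {
    "permitted_titles": ["Employee", "Manager", "VP"],
    "staff": [
        {"name": "Alice", "title": "Manager"},
        {"name": "Bob", "title": "Employee"},
    ],
}

def validate_staff(config):
    # The enum comes from an earlier-parsed section, not a hardcoded list.
    permitted = set(config["permitted_titles"])
    for person in config["staff"]:
        if person["title"] not in permitted:
            raise ValueError(
                f"{person['name']}: title {person['title']!r} "
                f"is not one of {sorted(permitted)}"
            )
    return config["staff"]

staff = validate_staff(config)
```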

I actually wrote this parser in the first place because I wanted to create a good metalanguage for tersely defining strongly typed executable specifications in YAML (i.e. Gherkin done right). Tons of stuff I wanted to strictly validate wouldn't have been possible with config-based schema validation, and with YAML's weak, implicit typing it was a fucking mess.


I run a single script that does a commit, pull and merge without opening the editor. It takes a couple of seconds and works fine.
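Such a script might look like this sketch (in Python rather than shell, with the exact git flags being my assumption, not the commenter's actual script; the runner is injectable so the commands can be inspected without touching a real repo):

```python
import subprocess

def sync(message="sync", run=subprocess.run):
    """Commit, pull/merge and push without ever opening an editor."""
    commands = [
        ["git", "add", "-A"],
        ["git", "commit", "-m", message],  # -m supplies the message inline
        ["git", "pull", "--no-edit"],      # --no-edit skips the merge editor
        ["git", "push"],
    ]
    for cmd in commands:
        run(cmd, check=True)
    return commands
```

Bound to a widget or alias, the whole round trip is a couple of seconds.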


Sure, but that's not really a great/feasible/possible workflow on mobile.


It is. I run a very similar script on termux, triggered via termux widget on my home screen.


I wrote a command line tool called "orji" because of this (still in alpha).

I wanted to use orgzly or a text editor and just be able to write templates (in jinja2) or bash scripts to push data in or out of my notes.


>you sort of can't make standalone ELisp apps (or at least nobody does). Otherwise you'd just have an `org2html` and `org2pdf` application at the command line.

Because I don't like emacs and wanted to lean on command line apps for getting data into and out of my orgzly org-mode electronic brain, I've been combining org-mode with jinja2 via a CLI app I wrote called 'orji'.

I think it's a very powerful combination. I use it to generate latex/html/whatever and short bash scripts which are then run. E.g.

* Generate my CV using a latex CV template I found on overleaf that I converted into a jinja2 template.

* Send the contents of a TODO note tagged 'sendemail' as an email with a tiny templated bash script.

* Create jira ticket with the details of a note.

* Generate reveal.js presentations.

All from my phone, using orgzly and one big button which runs a single termux script that seeks out TODO notes with labels (e.g. cv/sendemail) that match the scripts in my library (cv.sh / sendemail.sh).

jinja2 certainly has the capability to end up creating a huge old mess if you abuse it, especially to generate code, but I find that using it to template tiny 5-6 line bash scripts or a latex or an HTML file hits a sweet spot of usability and flexibility.
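That sweet spot looks roughly like this sketch (the note fields and the generated "script" are invented for illustration, and the stdlib's string.Template stands in for jinja2 to keep the example dependency-free):

```python
from string import Template

# Hypothetical note pulled out of an org file (orji's real data model may differ).
note = {"title": "Invoice March", "recipient": "accounts@example.com"}

# A tiny templated "script" in the spirit of the 5-6 line bash templates
# described above; $title / $recipient are filled from the note.
script_template = Template(
    "#!/bin/sh\n"
    "echo 'Sending: $title'\n"
    "mail -s '$title' $recipient < body.txt\n"
)

script = script_template.substitute(note)
```

Keeping the templated output this small is what keeps the approach from turning into the "huge old mess" jinja2 is capable of.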


A link to the documentation¹ and repo² to save others a few steps. Examples are helpfully linked toward the end of the page and its source in the README.

¹ https://hitchdev.com/orji/

² https://github.com/crdoconnor/orji


I don't think it is about discipline. Discipline is required for duplicating tedious work, not for creativity.

At its core, a good test will take an example and do something with it to demonstrate an outcome.

That's exactly what how-to docs do - often with the exact same examples.

Logically, they should be the same thing.

You just need a (non-Turing-complete) language that is dual use - it generates docs and runs tests.

For example:

https://github.com/crdoconnor/strictyaml/blob/master/hitch/s...

And:

https://hitchdev.com/strictyaml/using/alpha/scalar/email-and...
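In miniature, the dual-use idea is something like this sketch (the story structure and function names are illustrative, not hitchstory's actual API):

```python
# One declarative story that both runs as a test and renders as docs.
story = {
    "name": "Email validation",
    "given": {"email": "alice@example.com"},
    "then": {"valid": True},
}

def run_story(story, system):
    """Execute the story as a test against the system under test."""
    result = system(story["given"]["email"])
    assert result == story["then"]["valid"], story["name"]

def document_story(story):
    """Render the same story as how-to documentation."""
    return (
        f"## {story['name']}\n\n"
        f"Given the email {story['given']['email']!r}, "
        f"validation returns {story['then']['valid']}.\n"
    )

# A toy "system": a deliberately naive email check, for demonstration only.
run_story(story, system=lambda email: "@" in email)
docs = document_story(story)
```

The same example data drives both outputs, which is what keeps the docs and the tests from drifting apart.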


No, you just need to both understand how your system works and then clearly write down what it's doing and why. If projects like Postgres and SQLite and musl libc and the Linux kernel can all do it, I think the CRUD app authors can do it, too. But it's not magic, and another tool won't solve it (source: I've seen a hundred of these tools on scores of teams, and they don't help when people have no clue what's happening and then they don't write anything down).


"[docs] will take an example and do something with it to demonstrate an outcome."

No. Good docs will explain the context and choices made and trade-offs and risks and relations etc. All the things you can't read from the code. API docs can to a great degree be auto-generated, but not writing the _why_ is the beginning of the end.


It's not purely about skill, code quality also improves as a function of discipline, a willingness to take risks and outside pressure.

FWIW I don't think I've ever seen clean code that didn't make use of something at the very least resembling both exhaustive CI/CD and TDD. There are some practices which are basically essential, even if some of the mavens of "best" practices mistakenly label a few that aren't necessary as necessary.


Can you define and measure "code quality?" If not how can you say it improves?

Even granting your point, the things you list that supposedly improve code quality -- discipline, taking risks -- come under the "skills" heading in my understanding.

> I don't think I've ever seen clean code that didn't make use of something at the very least resembling both exhaustive CI/CD and TDD

Yet almost all of the code ever written, including very successful long-lived things like Unix, the C standard library, Oracle, etc. got written before CI/CD and TDD. I don't have a definition of "clean code" to judge by, but I have certainly worked on lots of readable and maintainable code that did not come from TDD or CI/CD processes.


The best way to conceptualize hexagonal architecture is as a kind of crutch to accommodate the inability of unit tests to effectively fake stuff like the DB, and their tendency to tightly couple to everything.

It's not intrinsically good design but it does improve unit testability (which sometimes has value and sometimes has zero value).


>And stuff like mocks etc take far longer to write than a 5 line fix

If you test from the outside in and build up a library of functional and realistic fakes, then over time this gets quicker and easier.

Ideally I think people shouldn't use mock objects at all, ever - only fakes of varying realism at the edge of the project, populated with realistic data.
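A minimal sketch of such a fake at the edge (the store and its methods are invented for illustration, not from any particular project):

```python
# A realistic in-memory stand-in for a real user store: it persists data
# and enforces the same invariants (e.g. uniqueness), so tests exercise
# behavior rather than asserting on individual mocked calls.
class FakeUserStore:
    def __init__(self, users=None):
        self._users = dict(users or {})

    def save(self, user_id, name):
        if user_id in self._users:
            raise KeyError(f"user {user_id!r} already exists")
        self._users[user_id] = name

    def get(self, user_id):
        return self._users.get(user_id)

# Tests plug the fake in at the project's edge, populated with
# realistic data, instead of mocking each call the code happens to make.
store = FakeUserStore({"u1": "Alice"})
store.save("u2", "Bob")
```

Because the fake behaves like the real thing, it can be reused across the whole test suite and grows more valuable over time.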

One reason for doing TDD is that it compels you to match a realistic test with a realistic scenario. I tend to find people lose that when they test after the fact, and instead lock the test into the current behavior of the code. This is not just tedious work, it's also harmful.


That's interesting. You had almost exactly the same idea as me: https://hitchdev.com/hitchstory


Yeah I didn’t came up with the idea first, at my work we used an in house solution[1] but it had lots of shortcomings (hard to read output, hard to debug tests, virtually no documentation, and the code was very hard to extend and fix). My plan is to fix all these issues and more :)

[1]: https://github.com/ovh/venom


Literate programming shouldn't necessarily apply to application code in large systems, but I almost always want more literate tests that clearly explain not just what should happen under all kinds of scenarios but why. Even better if I had hyperlinked explanations for domain specific terms.

If I always had this I think I'd become fully productive on large systems 2x as quickly.

I don't think this needs to be achieved by hiring a unicorn coder who is also a great writer, but by making tests more accessible and editable by good technical writers, so that both parties can collaborate around behavioral tests that double as documentation.

