I've been trying to do the same thing with my commit messages. It's just me for the moment, but knowing the why along with the what is a goldmine when you have to come back to some code 6 months later. Even better, vim has a git-commit mode to help remind me of the conventions.
Currently, it doesn't seem so - most of the functions have their volatility declared as stable, not immutable, so you can't simply create an index on the function without getting an error. I think long term the postgresql guys plan on supporting special GiST indexes for JSON, but that didn't make it into 9.3.
I imagine you could create a wrapper function marked immutable that calls the provided json accessor functions, but I'm not really clear on the implications of doing so.
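For what it's worth, the wrapper idea would look something like this. A sketch only: the function and table names are invented, and whether it is actually safe to promise IMMUTABLE here is exactly the open question.

    -- Hypothetical wrapper around the built-in accessor, declared IMMUTABLE so the
    -- planner will accept it in an expression index. The IMMUTABLE label is a
    -- promise we make ourselves; Postgres does not verify it.
    CREATE OR REPLACE FUNCTION json_get_text(doc json, key text)
    RETURNS text AS $$
      SELECT doc ->> key;
    $$ LANGUAGE sql IMMUTABLE STRICT;

    -- Made-up table with a json column.
    CREATE TABLE events (id serial PRIMARY KEY, payload json);

    CREATE INDEX events_user_idx ON events (json_get_text(payload, 'user_id'));

    -- Queries written against the wrapper can then use the index:
    SELECT * FROM events WHERE json_get_text(payload, 'user_id') = '42';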
Is there any reason to believe that you would not? I would be very surprised if there was any "pure" expression in PostgreSQL on which you couldn't have an index over comparison.
EDN[0] is natively extensible to more types, for example the dozens of types postgres supports[1], which actually makes EDN significantly more suitable for this use case. JSON can't encode the difference between a string, a date, a timestamp, and a uuid -- they're all encoded as strings, and consuming code needs to go out of band to understand the type.
the stuff about EDN being natively supported by clojure is a red herring and irrelevant.
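To make the "out of band" point concrete, here is a small made-up example: everything pulled out of a json value comes back as text, and the caller has to know what to cast it to.

    -- Made-up payload shape; ->> always yields text, so the types live in the
    -- calling code, not in the data itself.
    SELECT (payload ->> 'id')::uuid                 AS id,
           (payload ->> 'created_at')::timestamptz  AS created_at,
           payload ->> 'note'                       AS note  -- indistinguishable from the two above on the wire
    FROM events;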
In the general case, it has the low startup cost of a document db combined with a clear path to scalability with a future relational model.
In the specific case, it removes the data transformation code required to convert the clojure data structures I have to the sql, json, or xml that postgresql understands.

[0] https://github.com/edn-format/edn

[1] http://www.postgresql.org/docs/8.4/static/datatype.html
So the same argument as (say) YAML, Lua tables or TNetStrings. Or, if you include binary representations, stuff like Thrift, XDR and ASN.1.
You can embed a JVM in PostgreSQL and write the functions yourself (or use PL/Scheme I suppose), but I would be basically amazed if anyone decides to do it for you.
JSON gets the nod because it's understood by billions of systems. Other formats are going to struggle.
Are there any kinds of indexes on JSON data fields? The thing is, I would not need any kind of JSON manipulation unless it includes index-assisted queries. Doing linear lookups is pretty stupid even if it includes fancy accessors and validators for field formatting. I am far from being an expert on the topic... but there is no mention of calculated indexes on JSON-formatted fields.
-- are there indexes that assist queries against JSON data structures stored in JSON-typed fields?
Can you elaborate on the scaling issues? Being able to query into JSON fields (including functional indexes) is a great helper for denormalization which is very good for actually increasing scalability.
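A made-up sketch of that denormalization pattern: child data embedded as a json document on the parent row, with an expression index so lookups into the blob stay cheap (if the built-in accessor's volatility won't allow a direct expression index, as discussed upthread, the same index can be built on an IMMUTABLE wrapper instead).

    -- Hypothetical denormalized table: line items and shipping info embedded as json.
    CREATE TABLE orders (
        id         serial PRIMARY KEY,
        customer   text   NOT NULL,
        details    json   NOT NULL   -- e.g. {"items": [...], "shipping": {"country": "DE"}}
    );

    -- #>> extracts a nested path as text; note the extra parentheses required
    -- around a non-trivial index expression.
    CREATE INDEX orders_ship_country_idx
        ON orders ((details #>> '{shipping,country}'));

    SELECT id, customer
    FROM   orders
    WHERE  details #>> '{shipping,country}' = 'DE';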
For me, the native JSON support is a very handy tool to have in the toolbelt for parts of our application that have a very loose schema.
Well your database is a black box technically speaking. It's very hard to scale it horizontally and it is very expensive to scale it vertically as time goes on.
Logic suggests that you should keep as much processing functionality as possible outside something which can't be scaled cheaply or easily, and push it to cheaper front-end servers.
On this basis, anything which implies more work than collecting and shifting the data over the wire shouldn't really be in the database. Parsing / processing JSON is one of those things that's going to eat CPU/memory.
Fundamentally there's nothing wrong with storing JSON inside the database and processing it externally, but processing it inside the database is a big risk.
I've seen the same thing over the years with XML in the database and more recently people adding CLR code to SQL Server stored procedures.
I only say this because I've been there, with SQL Server's XML processing stuff, then spent nearly 2 years getting rid of it.
Hmm, I'd have to disagree with you on that. It is true that JSON processing could reduce raw scalability in the sense that a query that uses JSON may be slower than one without. However, having JSON processing in the database simplifies things quite a bit. For instance, imagine an app that processes and emits JSON and also uses a normalized database. This functionality now makes it possible to move some of the JSON processing closer to the database. In some cases, this may not be the best idea, and in others, it's a win. As with many things, having the choice is not the problem; it's how one chooses, given the choice. I can see the benefits of XML in the database, but I can also see how it could be misused. The key, as always, is to apply judicious thought to the problem.
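A small made-up example of what "moving JSON processing closer to the database" can look like: instead of fetching rows and serializing them in the app, 9.3's row_to_json/json_agg can emit the document straight from the query.

    -- Hypothetical schema: one JSON document per user, orders embedded as an array,
    -- built by the database instead of by application code.
    SELECT row_to_json(t)
    FROM (
        SELECT u.id,
               u.name,
               json_agg(o ORDER BY o.created_at) AS orders
        FROM   users u
        JOIN   orders o ON o.user_id = u.id
        GROUP  BY u.id, u.name
    ) t;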
Actually people USE JSON blob fields to increase scalability (as far back as FriendFeed).
Sure, doing this hampers normalization and single server speed, but if the queries are parallelizable with shards et al, what would hamper scalability?
This is an absolutely beautiful commit message
Many of the presentations on using plv8 to access JSON in Postgres 9.2 (e.g., https://wiki.postgresql.org/images/b/b4/Pg-as-nosql-pgday-fo... ) show the use of functional indexes, but they create a plv8 function marked as immutable.
So now it is indeed possible!
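For reference, the plv8 variant from those slides looks roughly like this. A sketch with invented names; it assumes the plv8 extension is installed.

    -- Hypothetical plv8 accessor declared IMMUTABLE so it can back an expression index.
    CREATE OR REPLACE FUNCTION json_string(doc json, key text)
    RETURNS text AS $$
      // Depending on the plv8 version, a json argument arrives either as a string
      // or as an already-parsed object; handle both.
      var obj = (typeof doc === 'string') ? JSON.parse(doc) : doc;
      return obj[key];
    $$ LANGUAGE plv8 IMMUTABLE STRICT;

    -- Made-up table; the functional index the presentations describe:
    CREATE INDEX people_name_idx ON people (json_string(data, 'name'));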
RFC6901: JSON Pointer
RFC6902: JSON Patch