Damn, that is how you write a git commit message.
Another fine example — from the same history — is the commit that fixed the security issue two weeks ago: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitd...

This is an absolutely beautiful commit message.

I've been trying to do the same thing with my commit messages. It's just me for the moment, but knowing the why along with the what is a goldmine when you have to come back to some code 6 months later. Even better, vim has a git-commit mode to help remind me of the conventions.

So, can these be mixed with functional indexes to index on JSON fields?
Currently, it doesn't seem so: most of the functions have their volatility declared as stable, not immutable, so you can't simply create an index on the function without getting an error. I think the PostgreSQL guys plan on supporting special GiST indexes for JSON in the long term, but that didn't make it into 9.3.

Many of the presentations on using plv8 to access JSON in Postgres 9.2 (e.g., https://wiki.postgresql.org/images/b/b4/Pg-as-nosql-pgday-fo... ) show the use of functional indexes, but they create a plv8 function marked as immutable.

I imagine you could create a wrapper function marked immutable that calls the provided json accessor functions, but I'm not really clear on the implications of doing so.
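
Something like this, presumably (a minimal sketch: the docs table, its body column, and the json_get_text name are all made up, and by declaring IMMUTABLE you are making the purity promise yourself):

    -- wrapper that re-exports the ->> accessor as IMMUTABLE
    CREATE FUNCTION json_get_text(json, text)
    RETURNS text AS $$
      SELECT $1 ->> $2
    $$ LANGUAGE sql IMMUTABLE;

    -- which can then back an expression index
    CREATE INDEX docs_name_idx ON docs (json_get_text(body, 'name'));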

And it turns out that marking many of the JSON functions stable was an oversight, which was corrected in git a few days ago: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;...

So now it is indeed possible!
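
A plain expression index over the built-in accessors should now work, along these lines (a sketch; events and payload are made-up names, and note the doubled parentheses an expression index requires):

    CREATE INDEX events_user_idx ON events ((payload ->> 'user_id'));

    -- queries of this shape can then use the index
    SELECT * FROM events WHERE payload ->> 'user_id' = '42';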

Is there any reason to believe that you couldn't? I would be very surprised if there were any "pure" expression in PostgreSQL on which you couldn't build an index based on comparison.
Now, if only it had EDN[0] support too.

[0] https://github.com/edn-format/edn

Serious question: why?
EDN is natively extensible to more types, for example the dozens of types Postgres supports[1], which actually makes EDN significantly more suitable for this use case. JSON can't encode the difference between a string, a date, a timestamp, and a UUID: they're all encoded as strings, and consuming code needs to go out of band to understand the type.
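
To illustrate (the #uuid and #inst tagged elements are built into the EDN spec):

    ; EDN: the reader hands back a real UUID and a real instant
    {:id #uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
     :created #inst "2013-03-27T12:00:00Z"}

    ; JSON: both values come back as plain strings
    {"id": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
     "created": "2013-03-27T12:00:00Z"}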

The stuff about EDN being natively supported by Clojure is a red herring and irrelevant.

[1] http://www.postgresql.org/docs/8.4/static/datatype.html

In the general case, it has the low startup cost of a document DB combined with a clear path to scalability through a future relational model.

In the specific case, it removes the data-transformation code required to convert the Clojure data structures I have into the SQL, JSON, or XML that PostgreSQL understands.

So the same argument as (say) YAML, Lua tables or TNetStrings. Or, if you include binary representations, stuff like Thrift, XDR and ASN.1.

You can embed a JVM in PostgreSQL and write the functions yourself (or use PL/Scheme, I suppose), but I would be basically amazed if anyone decided to do it for you.

JSON gets the nod because it's understood by billions of systems. Other formats are going to struggle.

Well, when you say it that way, I guess I must be wrong to want things that simplify my life as a developer.
It would be lovely if they could add support functions for the following (examples below):

RFC6901: JSON Pointer

RFC6902: JSON Patch
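
For the curious, a JSON Pointer (RFC 6901) is just a path string, and a JSON Patch (RFC 6902) is a list of operations over such pointers (the document being patched here is made up):

    "/orders/0/total"

    [ { "op": "replace", "path": "/orders/0/total", "value": 19.99 },
      { "op": "remove",  "path": "/orders/0/coupon" } ]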

Are there any kinds of indexes on JSON data fields? The thing is, I wouldn't need any kind of JSON manipulation unless it includes index-assisted queries. Doing linear lookups is pretty pointless even with fancy accessors and validators for field formatting. I'm far from an expert on the topic, but there is no mention of calculated indexes on JSON-formatted fields...

-- are there indexes that assist queries against JSON data structures stored in JSON-typed fields?

Ugh, another thing I really don't want in the database. Seriously, stuff like this will knacker your scalability over time.

I only say this because I've been there, with SQL Server's XML processing stuff, then spent nearly 2 years getting rid of it.

Can you elaborate on the scaling issues? Being able to query into JSON fields (including functional indexes) is a great helper for denormalization which is very good for actually increasing scalability.

For me, the native JSON support is a very handy tool to have in the toolbelt for parts of our application that have a very loose schema.

Well, your database is, technically speaking, a black box. It's very hard to scale it horizontally, and it is very expensive to scale it vertically as time goes on.

Logic suggests that you should keep as much processing as possible out of something that can't be scaled cheaply or easily, and push it to cheaper front-end servers.

On this basis, anything which implies more work than collecting and shifting the data over the wire shouldn't really be in the database. Parsing / processing JSON is one of those things that's going to eat CPU/memory.

Fundamentally there's nothing wrong with storing JSON inside the database and processing it externally, but processing it inside the database is a big risk.

I've seen the same thing over the years with XML in the database and more recently people adding CLR code to SQL Server stored procedures.

Hmm, I'd have to disagree with you on that. It is true that JSON processing could reduce raw scalability in the sense that a query that uses JSON may be slower than one without. However, having JSON processing in the database simplifies things quite a bit. For instance, imagine an app that processes and emits JSON and also uses a normalized database. This functionality now makes it possible to move some of the JSON processing closer to the database. In some cases, this may not be the best idea, and in others, it's a win. As with many things, having the choice is not the problem; it's how one chooses, given the choice. I can see the benefits of XML in the database, but I can also see how it could be misused. The key, as always, is to apply judicious thought to the problem.

Actually, these functions seem to support designs where the database is normalised, but front ends are emitting JSON.
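
For instance, row_to_json (in core since 9.2) will serve a normalized row straight to a JSON-speaking front end (the users table here is made up):

    SELECT row_to_json(u) FROM users u WHERE u.id = 1;
    -- => {"id":1,"name":"Alice"}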

Actually, people USE JSON blob fields to increase scalability (as far back as FriendFeed).

Sure, doing this hampers normalization and single-server speed, but if the queries are parallelizable with shards et al., what would hamper scalability?
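
Roughly the FriendFeed pattern, sketched here in Postgres terms (FriendFeed did it on MySQL, and all of these names are made up):

    -- one opaque blob per entity; trivially sharded by id
    CREATE TABLE entities (
        id   uuid PRIMARY KEY,
        body json NOT NULL
    );

    -- each "index" is its own table, shardable independently of the blobs
    CREATE TABLE index_user (
        user_id   uuid NOT NULL,
        entity_id uuid NOT NULL,
        PRIMARY KEY (user_id, entity_id)
    );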

For some reason this website wants me to run a Java applet. Nope. I'm going to flag this submission, and maybe a mod can shed some light on this.
I hope you flag everything with Flash, too.

Better start flagging JavaScript as well.

Now we are talking!
