[personal profile] elfs

I can’t tell if I’m suffering from the geek equivalent of writer’s block, analysis paralysis, overload, or some combination of all of the above.

I’ve been working on a generalized platform for narrator, the story engine for the Journal Entries and everything else. As I’m working my way through the process, I’m thinking in typical RDBMS fashion, “Okay, the system has Authors and an Author has Serials and a Serial may have more Serials which eventually leads to having Stories.” (To make the whole “Serials” thing clearer: A universe has novels; novels have chapters. “Universe” and “Novel” are synonyms for Serial, and “Chapter” is a synonym, in this system, for “story”– an atomic block of text readable in one sitting. Theoretically.) I have a tree-based hierarchy for Serials working, although I’m damnably unhappy with where in the system the knowledge for “An author may have zero or more serials” is stored.

The actual architecture is sorta backwards: Each story has a foreign key to a serial, and a serial may have a foreign key to a parent, and ultimately all serials have a foreign key to a user, and so forth; the usual one-to-many relationships built within SQL. The nice thing about Django is that it notices these things and builds the backing querysets automagically, so you can traverse either way along this structure without having to write your own code.
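To make the "backwards" layout concrete, here is a minimal sketch using Python's built-in sqlite3 rather than Django. The table and column names are assumptions for illustration (the real narrator schema isn't shown here), and the recursive query stands in for the reverse traversal that Django's querysets would provide.

```python
import sqlite3

# Hypothetical table and column names; not the actual narrator schema.
SCHEMA = """
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE serial (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(id),
    parent_id INTEGER REFERENCES serial(id)   -- NULL for a top-level serial
);
CREATE TABLE story (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    serial_id INTEGER NOT NULL REFERENCES serial(id)
);
"""


def build_demo_db():
    """Create an in-memory database with one universe, one novel, two chapters."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO author VALUES (1, 'Example Author')")
    db.executemany(
        "INSERT INTO serial VALUES (?, ?, ?, ?)",
        [(1, "Universe", 1, None), (2, "Novel", 1, 1)],
    )
    db.executemany(
        "INSERT INTO story VALUES (?, ?, ?)",
        [(1, "Chapter One", 2), (2, "Chapter Two", 2)],
    )
    return db


def stories_under(db, serial_id):
    """Walk down from a serial, through any nested serials, to their stories."""
    rows = db.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM serial WHERE id = ?
            UNION ALL
            SELECT serial.id FROM serial
            JOIN subtree ON serial.parent_id = subtree.id
        )
        SELECT story.title FROM story
        JOIN subtree ON story.serial_id = subtree.id
        ORDER BY story.id
        """,
        (serial_id,),
    )
    return [title for (title,) in rows]
```

Every foreign key points "up" the tree, yet `stories_under(db, 1)` still finds the chapters two levels below the universe.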

And I pretty much have all that put together. I have some performance worries, and have seriously contemplated what this would work like under Mongo, but so far, it seems to be fine.

Time to upload my stories to the system, right? Four “Universes,” one merely labeled “Other Stories.” And I’m thinking, well, I could do it by hand– yeah, all 400 stories or so– or I could fake it and just spew the data from the old database, through a transform, to the new one– fast, but misses the point– or, the choice I made, I could write a simple web-based API.

“Simple.”

It turns out that there are three mysteries to web-based APIs. First, there is RESTful verb architecture, meaning: do you encode the verb in the URL, or do you go with the HTTP verbs of PUT and DELETE along with the more traditional GET and POST?
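The two options can be laid side by side as routing tables. This is only a sketch: the URL paths and handler names are made up for illustration, and a real Django URLconf would look different.

```python
# Style 1: the verb lives in the URL, and everything is a GET or a POST.
VERB_IN_URL = {
    ("GET", "/story/view"): "view_story",
    ("POST", "/story/update"): "update_story",
    ("POST", "/story/delete"): "delete_story",
}

# Style 2: one URL per resource; the HTTP method itself is the verb.
VERB_IN_METHOD = {
    ("GET", "/story"): "view_story",
    ("PUT", "/story"): "update_story",
    ("DELETE", "/story"): "delete_story",
}


def route(table, method, path):
    """Look up the (hypothetical) handler name for a request."""
    return table.get((method.upper(), path), "not_found")
```

In the second style a plain POST to `/story` has no handler unless one is added for it, which is one of the ways the method-as-verb approach constrains the design.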

RESTful verb architecture rigidifies the interaction with your system in strange ways. Consider my description above as a web page; a web designer would build a limited view of the author’s profile data, plus a tree-based view of the author’s serials and the stories they contain. Now, if your URL for an author is, say, “/author/<author_id>/”, it might make sense in an API to provide a complete view of the author profile and a complete list of his works, or it might make sense to have two calls– one for profile, one for his works– but there’s no real way to structure the system in such a way that the URL refers to some of the author’s information, and a tree of serials and stories. Those are four different relationships, referring to three different objects, one of which recurses.
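For reference, the mixed view a web designer would want (a few profile fields plus a nested tree) is easy enough to assemble by hand. The sketch below uses plain dicts with assumed field names; nothing about it is specific to Django or to narrator.

```python
def build_author_view(author, serials, stories, public_fields=("name",)):
    """Combine a limited profile with a nested tree of serials and stories.

    `serials` rows are dicts with id, title, parent_id; `stories` rows are
    dicts with id, title, serial_id. All field names are illustrative.
    """
    nodes = {
        s["id"]: {"title": s["title"], "serials": [], "stories": []}
        for s in serials
    }
    roots = []
    for s in serials:
        node = nodes[s["id"]]
        if s["parent_id"] is None:
            roots.append(node)  # a top-level serial, e.g. a universe
        else:
            nodes[s["parent_id"]]["serials"].append(node)
    for story in stories:
        nodes[story["serial_id"]]["stories"].append(story["title"])
    # Only expose the whitelisted profile fields, not the whole row.
    profile = {field: author[field] for field in public_fields}
    return {"profile": profile, "serials": roots}
```

The awkward part is not building this structure but deciding which single URL, if any, it belongs to.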

This interacts with the other complete problem: how do you return data to the client? As… what? JSON? XML? YAML? A blob of text? Remember that one of the nifty miracles of narrator was its ubiquity: you could download the stories highly styled, or simplified for text-to-speech (I seem to have a lot of blind readers), or as text, or as a PDB Docbook format appropriate for ebook readers. Once the system has decided what action is going to be taken on the resource, that resource has to be rendered and returned to the client.
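One common way to keep the rendering step separate from the action is a table of renderers keyed by format. The following is a sketch using only the standard library; the format keys, the story fields, and the idea of choosing by a short format name are all assumptions, and YAML and the ebook formats are left out because they need third-party tools.

```python
import json
import xml.etree.ElementTree as ET


def render_json(story):
    return json.dumps(story, sort_keys=True)


def render_xml(story):
    # Assumes every key is a valid XML element name.
    root = ET.Element("story")
    for key, value in sorted(story.items()):
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def render_text(story):
    return f"{story['title']}\n\n{story['body']}\n"


# Format name -> (content type, renderer). Adding a format means adding a row.
RENDERERS = {
    "json": ("application/json", render_json),
    "xml": ("application/xml", render_xml),
    "txt": ("text/plain", render_text),
}


def render(story, fmt):
    """Return (content_type, body); raises KeyError for an unknown format."""
    content_type, renderer = RENDERERS[fmt]
    return content_type, renderer(story)
```

For example, `render({"title": "T", "body": "B"}, "txt")` returns the plain-text content type along with the title, a blank line, and the body.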

All of this, when dealing with a framework like Django or Rails, also seems to create the headache that they expect the REST commands to correspond to row actions on a table, which is not what’s going on here. An author isn’t just a row entry in the User table; it’s also a resource with other resources. Serializing it isn’t a simple matter of taking the row, the column names, and adding them together into a dictionary.

All of this comes with one surprisingly enormous pressure, from the Django community: DRY. Don’t Repeat Yourself. If you do, yur doin’ it wrong, as you Intarwebfolk like to say. (I am aware of all Internet traditions.) So when I start to write a generic method for handling the XML, YAML, and JSON versions, and have a “special” version for handling HTML, I worry that I’m repeating myself and the anxiety is crippling.

I should just say “Fuck it,” neh? Just write a stupid intake for “add a serial to a serial,” and another for “add a story to a serial,” and live with it. It might even become canon, and if I ever publish this thing, I’ll have people cursing me for using POX. Do everything in JSON (it’s what I know best) and worry about extensions later.
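A "stupid intake" along those lines might look like the sketch below. The `Library` class is an in-memory stand-in for the database, and the JSON field names (`action`, `title`, `parent_id`, `serial_id`) are invented for illustration rather than taken from narrator.

```python
import json


class Library:
    """In-memory stand-in for the database; ids are assigned sequentially."""

    def __init__(self):
        self.serials = {}  # id -> {"title", "parent_id"}
        self.stories = {}  # id -> {"title", "serial_id"}

    def add_serial(self, title, parent_id=None):
        if parent_id is not None and parent_id not in self.serials:
            raise ValueError(f"unknown parent serial {parent_id}")
        new_id = len(self.serials) + 1
        self.serials[new_id] = {"title": title, "parent_id": parent_id}
        return new_id

    def add_story(self, title, serial_id):
        if serial_id not in self.serials:
            raise ValueError(f"unknown serial {serial_id}")
        new_id = len(self.stories) + 1
        self.stories[new_id] = {"title": title, "serial_id": serial_id}
        return new_id


def intake(library, body):
    """Handle one JSON request body and return a JSON response body."""
    request = json.loads(body)
    action = request["action"]
    if action == "add_serial":
        new_id = library.add_serial(request["title"], request.get("parent_id"))
    elif action == "add_story":
        new_id = library.add_story(request["title"], request["serial_id"])
    else:
        raise ValueError(f"unknown action {action!r}")
    return json.dumps({"id": new_id})
```

Putting the action name in the payload is exactly the verb-in-the-message style that RESTful purists object to, but it is quick to write and easy to drive from an import script.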

Oh, yeah, there was a third complication: Authentication.  That mystery is less difficult, but still annoying.
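The post doesn't settle on a scheme, though a later comment mentions HTTP Digest Authentication. Assuming that scheme with `qop=auth` as described in RFC 2617, the core of the server-side check is recomputing the client's `response` hash. This is a sketch of that calculation only; nonce generation, replay protection, and any Django integration are left out.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the expected `response` value for Digest auth with qop=auth."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what they are asking for
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

The server compares this value with the `response` field of the client's Authorization header. Because only hashes cross the wire, the password itself is never sent.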

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com

Good 3rd Normal Form

Date: 2009-07-04 06:41 am (UTC)
From: (Anonymous)
Sounds like you have a simple and good 3rd Normal Form RDBMS schema, and then you started letting it get complicated.

What I would do, rather than try to reinvent the wheel from scratch, is initially inquire of Lazeez at StoriesOnLine how he structures his site. Last I recall he's using MySQL to manage hundreds of authors, thousands of stories in both serial and non-serial forms, story universes and serials, multiple download formats, and ancillary features like voting, individual user bookmarks to stories, favorites lists, limitations on daily downloading by non-premier members, story tags, searching on a variety of criteria, author blogs, user authentication, and other minutiae. Lazeez has been friendly about discussing such ideas in the past and it costs little to ask.

You might also know people at some other major story sites who are willing to share info. Use that as a starting point to take what works for you.

Of course, if the system is big (I don't think yours is at the moment) and performance is paramount then one can consider the modern post-RDBMS Big Table type solutions.

Or you could throw it into the Cloud. :^)

--DB_Story

Re: Good 3rd Normal Form

Date: 2009-07-04 04:01 pm (UTC)
From: [identity profile] elfs.livejournal.com
I've already talked to everyone I can about this. Besides, this isn't really something I think I'd want to talk too much about at this point; if I do this correctly, I might try putting Lazeez out of business.

Oh, and the base application already runs on Google AppEngine. :-)

Date: 2009-07-04 10:35 am (UTC)
From: [identity profile] cadetstar.livejournal.com
I know you're not primarily developing this in Rails, but if I understand your situation correctly, then serving the data in Rails is just a matter of:

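@author = Author.find_by_keyname(params[:keyname])
# Copy into a plain array; << on the association itself would reassign records
@serials = @author.serials.to_a
# Append each serial's sub-serials (one level deep), keeping the list flat
@serials += @serials.flat_map { |serial| serial.subord_serials.to_a }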


Then serve how it needs to be. The serving can be done via a server-side serving (create the files, then send_file to the client), or done in the view code itself.

-Michael

Date: 2009-07-04 04:00 pm (UTC)
From: [identity profile] elfs.livejournal.com
That's exactly it, but there are a number of niggling details that drive me to distraction: how do I choose the rendering format? In the case of the author<->serial<->story relationship, containing both in a SOAP or POX object is not supported by the standard so few tools exist to help me with that.

It's more a matter of the number of choices I have, and the "correct" way to do it, that has me up all night.

I did have an inspiration last night, an abstraction of the serial<->story relationship called "works", that would allow for a simple meta-database handle. I'll implement it after I figure out (and publish) a "quick and dirty guide to HTTP Digest Authentication Using Django."

Date: 2009-07-04 02:00 pm (UTC)
From: [identity profile] lucky-otter.livejournal.com
Regarding your note about how Django just wants to dump some things into a hash, I hate the tendency in ORMs to sometimes treat your data as first class objects and sometimes want to go straight to the backing tables.

It makes some things much harder than they should be. I'm not sure how many hours I've spent working around that, but it's too many.

Date: 2009-07-05 12:59 am (UTC)
From: (Anonymous)
Mildly off topic, and an apology, but I'm glad I'm not the only blind reader of your stories... or one among few even.

My theory is that blind people want porn/erotica/whatever like anyone else. The main way of getting this, given that I lost my ability to see visual porn, was to go to text stories on the internet. In essence I came for the sex, and stayed for the stories and their innate quality as stories.

In general though us blindies are fairly happy with HTML, which fits nice and neatly with sighted users wanting to read the stories directly from the web browser on just about any platform you care to mention. Some blind people can get overly enthusiastic, "give them an inch and they take a mile" type thing, so watch yourself with that. There is nothing wrong with HTML, if they can browse the site to find the stories to download then they can read it online. With the exception of the free reader Thunder (which I would suggest people swap to the better and free NVDA) at any rate.
