Fwd: OT: Resources for Data Base Design

Devin Asay devin_asay at byu.edu
Tue May 11 11:25:55 EDT 2010


Sivakatirswami,

I sent your message on to a colleague who is an expert in text markup schemes and this was his reply. It may imply a slightly different direction from what you have started with.

HTH

Devin

Begin forwarded message:

> From: Jarom McDonald 
> Date: May 11, 2010 8:42:40 AM MDT
> To: Devin Asay <devin_asay at byu.edu>
> Subject: Re: OT: Resources for Data Base Design
> 
> Hi Devin,
> 
> At least in the world of academia, what he's looking for just isn't done. Whether for philosophical reasons, common practice reasons, or whatever, there is very little work done in decomposing texts for relational models. Rather, texts are kept whole and marked up in XML, which to most people preserves the complexity of the text and facilitates publishing and dissemination.
> 
> This isn't to say that relational models can't be useful; I have seen products where texts are marked up in the TEI schema (the standard for XML encoding of text) and then elements are chopped up and put into a DB; however, you can achieve similar levels of performance with an XML database. The most widely used is called eXist; there are plenty of scripts you can find by googling TEI + eXist that can help in storing XML docs in the XML database, querying with XQuery and XPath to find documents, creating indices, etc.
> 
> Of course, this probably doesn't help much, as Revolution has native support for RDBMSs but not for XML databases. But for full texts, the relational route just isn't used in academia on any sort of wide scale.
> 
> Jarom
> 
> On Mon, May 10, 2010 at 9:36 AM, Devin Asay <devin_asay at byu.edu> wrote:
> Jarom,
> 
> This came over the Revolution mail list. Any recommendations I could point him to? The last long paragraph details what he's looking for.
> 
> Devin
> 
> 
> Begin forwarded message:
> 
>> From: Sivakatirswami <katir at hindu.org>
>> Date: May 8, 2010 7:53:08 PM MDT
>> To: How to use Revolution <use-revolution at lists.runrev.com>
>> Subject: OT: Resources for Data Base Design
>> Reply-To: How to use Revolution <use-revolution at lists.runrev.com>
>> 
>> I'm working on a content management database based on the Dublin Core 
>> and the Media Annotation Initiative. Much of the whole mode of discourse 
>> and terms translate well into a database scheme, but when the discourse 
>> starts talking about fine tuning and switches to an RDF framework it 
>> is difficult to grok in terms of translating some of the principles into 
>> actual table-field structures in a PostgreSQL database. The Dublin Core 
>> seems in some respects a very abstract realm... but things are different 
>> where the rubber hits the road.
>> 
>> I've looked pretty closely at the databases generated by XOOPS, Drupal 
>> and WordPress and frankly, they are freaky scary. I see a hodgepodge 
>> of strategies, each differing, depending on who designed the module 
>> whose tables you are looking at. That's why I want to stay with Dublin 
>> Core, where the "human readability" principle is kept in the forefront of 
>> design. I'm pretty close to designing a schema that I think can contain 
>> pretty much all the metadata for any video, text or audio, translations, 
>> pamphlets, FAQs, etc. that we have. I suppose we are re-inventing the 
>> wheel a bit, but in the end we will get something that is a good match 
>> for our needs and we will not be boxed into the framework of a monster CMS 
>> that we cannot customize without spending huge $ on PHP-module 
>> consultants... (been there, done that, nightmare)
>> 
>> Metadata for a video or a sound file or an image is simple enough....
>> 
>> The part of the database I'm unable to finish off is that which deals 
>> with text fragments. I think I posted this before on this list but got 
>> no responses. If anyone knows what would be the best list or group I 
>> should go to, to get help, let me know. What I'm interested in should be 
>> pretty standard stuff in the world of academia: e.g. if you want a 
>> database to contain the most atomic elements of a text resource (one record 
>> for every single verse of every single poem from a book where the poems 
>> are divided into chapters and the chapters into sections and the 
>> sections into parts of a book, and the book is one volume in a 
>> series...), what is the best schema which allows you to query the 
>> database to re-aggregate all those elements into the original source 
>> document, at run time (or on a cron, or periodically after modifications), 
>> and/or what other approaches might better serve the end game (be able 
>> to query for a single verse with complete citation; be able to query for 
>> an entire poem with citation; be able to query for a complete chapter of 
>> poems with a citation ... etc.)? I have some solutions in mind, and I 
>> may just proceed with those, and refactor later if something better 
>> comes along... but I would love to hear from some experts and see some 
>> existing models.
>> 
>> Any ideas of where to go looking for mangos?
>> 
> 
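
For anyone who does want to try the relational route described in the question, one common pattern is a single self-referencing table (an "adjacency list") queried with recursive common table expressions. The following is only a rough sketch under several assumptions: it uses Python's built-in sqlite3 module instead of PostgreSQL so that it runs on its own (PostgreSQL also supports WITH RECURSIVE, but the printf() call below is SQLite-specific), the table name "node", its columns, and the helper functions are all hypothetical, and the rows are made-up placeholder text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(id),  -- NULL for the top-level node
    kind      TEXT NOT NULL,     -- e.g. 'book', 'chapter', 'poem', 'verse'
    position  INTEGER NOT NULL,  -- order among siblings
    label     TEXT NOT NULL,     -- used to build citations
    body      TEXT               -- text, only on the atomic (verse) rows
);
""")

# Placeholder content: one book, one chapter, two poems, three verses.
rows = [
    (1, None, "book", 1, "Sample Book", None),
    (2, 1, "chapter", 1, "Chapter 1", None),
    (3, 2, "poem", 1, "Poem 1", None),
    (4, 3, "verse", 1, "Verse 1", "First line of sample text."),
    (5, 3, "verse", 2, "Verse 2", "Second line of sample text."),
    (6, 2, "poem", 2, "Poem 2", None),
    (7, 6, "verse", 1, "Verse 1", "Third line of sample text."),
]
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?, ?)", rows)


def citation(node_id):
    """Walk from a node up to the top level and join the labels."""
    sql = """
    WITH RECURSIVE up(id, parent_id, label, depth) AS (
        SELECT id, parent_id, label, 0 FROM node WHERE id = ?
        UNION ALL
        SELECT n.id, n.parent_id, n.label, up.depth + 1
        FROM node n JOIN up ON n.id = up.parent_id
    )
    SELECT label FROM up ORDER BY depth DESC
    """
    return ", ".join(r[0] for r in conn.execute(sql, (node_id,)))


def text_of(node_id):
    """Re-aggregate every verse under a node, in document order."""
    sql = """
    WITH RECURSIVE down(id, body, path) AS (
        SELECT id, body, printf('%08d', position) FROM node WHERE id = ?
        UNION ALL
        SELECT n.id, n.body, down.path || '.' || printf('%08d', n.position)
        FROM node n JOIN down ON n.parent_id = down.id
    )
    SELECT body FROM down WHERE body IS NOT NULL ORDER BY path
    """
    return [r[0] for r in conn.execute(sql, (node_id,))]


print(citation(4))   # citation for a single verse
print(text_of(3))    # all verses of one poem
print(text_of(1))    # the whole book, re-aggregated
```

The same two queries serve every level of the hierarchy: pass a verse id, a poem id, or a chapter id and you get the citation or the reassembled text for that unit. The zero-padded path column is just one way to keep sibling order stable when sorting; it is a design choice of this sketch, not an established standard.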

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University
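
To give a feel for the XML route Jarom describes, here is a small sketch of keeping the text whole and pulling out a verse or a poem with XPath. It is only an illustration under stated assumptions: it uses Python's standard xml.etree.ElementTree module (which supports a limited XPath subset) and not eXist or XQuery, the markup borrows the TEI element names div, lg and l but omits the TEI namespace and header for brevity, and the helper functions and sample lines are hypothetical.

```python
import xml.etree.ElementTree as ET

# A tiny TEI-style fragment: chapters as <div>, poems as <lg> (line group),
# verses as <l>. The content is placeholder text.
doc = ET.fromstring("""
<text>
  <div type="chapter" n="1">
    <lg type="poem" n="1">
      <l n="1">First line of sample text.</l>
      <l n="2">Second line of sample text.</l>
    </lg>
    <lg type="poem" n="2">
      <l n="1">Third line of sample text.</l>
    </lg>
  </div>
</text>
""")


def verse(chapter, poem, line):
    """Return the text of one verse, or None if it does not exist."""
    path = f".//div[@n='{chapter}']/lg[@n='{poem}']/l[@n='{line}']"
    node = doc.find(path)
    return None if node is None else node.text


def poem_lines(chapter, poem):
    """Return all verses of one poem, in document order."""
    path = f".//div[@n='{chapter}']/lg[@n='{poem}']/l"
    return [l.text for l in doc.findall(path)]


print(verse(1, 1, 2))
print(poem_lines(1, 2))
```

Because the document is never decomposed, there is nothing to re-aggregate: the chapter, poem and verse numbers in the path double as the citation.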




More information about the use-livecode mailing list