[OT] Text analysis and author, anyone done it?
Richmond Mathewson
richmondmathewson at gmail.com
Fri Jul 1 04:11:47 EDT 2011
On 07/01/2011 10:27 AM, Peter Alcibiades wrote:
> I do think it's possible, and it has actually been done successfully. The Bible
> is a difficult case since we don't have value-free assessments of
> authorship. Consequently it is reasonable to argue that what the programs
> do is successfully implement the prejudices of their authors.
>
> However, when we apply this to Dickens, and then ask whether the various
> completions of Edwin Drood were written by him, and we apply it to Jane
> Austen and ask whether the software shows the same person to have written
> the works of Austen and Fanny Burney, we are dealing with definitely known
> authorship, so we can assume that if the algorithms discriminate correctly
> in these cases they will probably work on other material where authorship is
> unknown.
>
> The case which I'm looking to apply this to is a bit more like the literary
> case. There are a number of texts of which the authorship is definitely known
> and not subject to dispute. There is then one text whose authorship is
> unknown. The question is whether it is probably by one of the known
> authors.
>
> We do also have a case like the Biblical case - where there are texts under
> one signature that we suspect to have come from more than one author, and
> perhaps from the author of the text of primary interest. It would be nice
> to be able to discriminate between authors in this body of work as well.
I would leave that to Burton Mack:
http://en.wikipedia.org/wiki/Burton_L._Mack
"Quelle" is NOT for computer programmers . . . :)
> Thanks for the very helpful references. It's early days yet, but there is no
> reason why statistical analysis should not illuminate this question, and
> there are some promising leads. Certainty is not to be expected, but
> statistical text analysis is definitely one weapon in the armory.
>
> I've no views on Paul and Hebrews. Whether an author really does write in
> statistically different ways depending on audience, well, it's an empirical
> question; I haven't come across any studies. Yes, the working assumption in
> the stats is that they don't. One could probably tell by looking at the
> work of some prolific authors who have published in different segments under
> different names. There are such cases.
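
For anyone who wants to experiment with the "several texts of known
authorship, one text of unknown authorship" case described above, here is a
minimal sketch in Python. It is not the method used by any of the studies
mentioned in this thread. It assumes a simplified variant of Burrows's Delta:
take the most frequent words in the known texts, turn each text into a profile
of relative word frequencies, standardise each word's frequency across the
known authors, and rank the authors by mean absolute difference from the
unknown text. The names tokenize, relative_freqs and rank_candidates, the
n_words parameter and the sample strings are all made up for illustration.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_freqs(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    words = tokenize(text)
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in vocab]


def rank_candidates(known, unknown, n_words=30):
    """Rank known authors by stylistic distance to the unknown text.

    known   : dict mapping author name -> text of undisputed authorship
    unknown : text whose authorship is in question
    Returns a list of (author, distance) pairs, closest author first.
    """
    # Vocabulary: the n_words most frequent words across all known texts.
    pooled = Counter()
    for text in known.values():
        pooled.update(tokenize(text))
    vocab = [w for w, _ in pooled.most_common(n_words)]

    profiles = {a: relative_freqs(t, vocab) for a, t in known.items()}

    # Per-word mean and standard deviation across the known authors.
    columns = list(zip(*profiles.values()))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # 1.0 if a word never varies

    def zscores(profile):
        return [(f - m) / s for f, m, s in zip(profile, mus, sigmas)]

    z_unknown = zscores(relative_freqs(unknown, vocab))
    scores = []
    for author, profile in profiles.items():
        z_author = zscores(profile)
        delta = mean(abs(x - y) for x, y in zip(z_unknown, z_author))
        scores.append((author, delta))
    return sorted(scores, key=lambda pair: pair[1])


# Toy usage with made-up strings; real work needs long texts per author.
samples = {
    "Author A": "the cat and the dog and the bird " * 20,
    "Author B": "of mice and of men of course " * 20,
}
print(rank_candidates(samples, "the dog and the cat and the fox " * 20))
```

The smallest distance only says which known author the unknown text most
resembles on these word counts, so it always names someone, even if the real
author is not among the candidates. Running the same function on texts of
undisputed authorship first (the Dickens and Austen checks above) is one way
to see whether the distances discriminate at all before trusting them on the
disputed text.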