[OT] Text analysis and author, anyone done it?
bobs at twft.com
Thu Jun 30 17:42:16 CDT 2011
Since the subject was broached using textual analysis of Biblical passages as an example, I will respond in kind. If anyone will be offended at this, please stop reading, close the email, and ignore any future posts to this thread by me. I've given fair warning. Please, no flames.
The linked article is a perfect case in point for why software-based textual analysis does NOT work well for the purposes that some may intend. Part of the problem is that it relies upon a number of assumptions, human assumptions no less (for all software is really a reflection of the developer), not the least of which is that a single author can have only one style of writing. Paul, for example, was one of the most learned Hebrew scholars of his time, but he was also raised in Grecian society and well schooled in its ways and traditions. When he wrote to the Gentiles, you will find that his style was distinctly different from that of the book of Hebrews, where he wrote to his fellow countrymen about what Christianity means to the Jew, to the Torah (the books of the law) and to those Hebrews who had embraced Christianity. The differences have caused no end of disagreement among the learned about who really wrote the book of Hebrews. Some say it was Peter or John, but then the same difficulty arises when comparing Hebrews to other Biblical writings of those authors.
In response to the specific example in the article about the book of Isaiah, Jesus quoted from the two sections of Isaiah, commonly believed by "progressive" critics to be written by two different "Isaiahs", saying in the second quote, "That same Isaiah...". There are some who believe there were 5 or 6 Isaiahs, although this is considered "fringe" even by the majority of the progressive critics.
Even in the mainstream, many critics believe that Isaiah was written AFTER the time of Christ, mainly due to the very specific prophecies about the Christ, which are too many to name here. (That is a pretty nice trick, seeing that the Dead Sea Scrolls contain the parts that those scholars say came after Christ, but no archaeologist in his right mind would contend the Dead Sea Scrolls are post-Christ!)
Now I don't think it takes a genius to see that if Jesus was wrong about anything, then his whole claim to be the Son of God lies in ruins, and we can only pity him as a self-deluded fool, or worse yet, a vile deceiver. But if His claims are true, then it is the critics who are to be pitied, because they are simply and completely wrong.
I can tell you that when I am typing an email of this nature, my grammar and tone morph into something much more formal and precise. When I email my friends about a funny joke, my writing style is very different. Software analysis of all I have ever written, from my first essay in grade school to now, would probably conclude there are 20 of me.
All that to say this: Software may be able to turn up some interesting facets of writing styles, and we may be able to learn something from it, but it is not the software that fails us here. It is humans who place far more value on the apparent results than is warranted. I guess people cannot help but grab ahold of what they can, so they can believe what they want to believe, on either side of the issue. Software cannot fix that.
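For anyone who wants to see the register problem for themselves, here is a minimal sketch in Python (not LC, and not the method used by the software in the article). It assumes a naive stylometric measure: the relative frequency of a handful of function words, compared by cosine distance. The word list, the function names, and the two sample sentences are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list, chosen only for illustration.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in",
                  "that", "is", "it", "i", "you", "so"]

def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_distance(p, q):
    """Return 1 minus the cosine similarity of two frequency profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm if norm else 1.0

# Two invented samples standing in for one writer in two registers.
formal = ("It is the considered opinion of the committee that "
          "the matter is in need of further review.")
casual = "so i told you and you laughed, and i laughed too"

print("same text:        ", cosine_distance(profile(formal), profile(formal)))
print("formal vs. casual:", cosine_distance(profile(formal), profile(casual)))
```

A measure this crude sees the two samples as far apart even though one person wrote both. Real attribution tools are much more careful than this, but the assumption underneath is the same one I am questioning above.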
On Jun 30, 2011, at 1:01 PM, Ronald Zellner wrote:
> Not intending to get into the specific topic, but the process in a recent article is related to your question:
> "The new software analyzes style and word choices to distinguish parts of a single text written by different authors, and when applied to the Bible its algorithm teased out distinct writerly voices in the holy book."
> "The program, part of a sub-field of artificial intelligence studies known as authorship attribution, has a range of potential applications – from helping law enforcement to developing new computer programs for writers. But the Bible provided a tempting test case for the algorithm’s creators."
> Allison Pollalrd has a PPT on the topic: lyle.smu.edu/~rewini/5-7339/allison.ppt
> Has these references:
> Ephratt, Michal. Authorship attribution - the case of lexical innovations. http://www.cs.queensu.ca/achallc97/papers/p006.html
> Gerritsen, Corey M. Authorship Attribution Using Lexical Attraction. http://genesis.csail.mit.edu/papers/Gerritsen2003.pdf
> Holmes, David I. Stylometry: Its Origins, Development and Aspirations. http://www.cs.queensu.ca/achallc97/papers/s004.html
> Pfleeger, Charles P. and Shari Lawrence Pfleeger. Security in Computing. Pg 342.
> Whether or not the processes can be translated into LC code is a separate issue.