Gender Genie Archives
At about 7:30 this morning, my sister called and told me to turn on our local Fox News station (WNYW) to watch their "Web Wednesday" segment. To my complete amazement, the Gender Genie was the lead site featured, and I nearly fainted from excitement. So, thanks so much to Fox and...
Welcome Fox 5 News New York viewers!
If you're interested in more information about the algorithm behind the Genie, we've written quite a bit about it. Comments are always welcome, and we hope you enjoy your visit.
Update: Fox has uploaded a video of the segment to their website. Way cool.
Another Perspective on the Gender Genie
The Gender Genie continues to spark interest, and I have just read one of the best blog posts I've seen about it on Customer Experience Crossroads.
Most of the time, people drop their writing into it and, when they don't get the result they expect, declare it to be wrong, wrong, wrong. Yet, a lot of its users still find it and its analysis to be a fun time waster. Despite having written the program, I didn't come up with the algorithm and believe that the Genie works no better than the flip of a coin. However, I don't think it to be a complete time waster since there actually is some academic study that went into it.
In the most basic terms, the computational linguists behind the algorithm, Koppel and Argamon, took a bunch of fiction and looked for trends based on gender. Using complicated formulas, they determined that male writers tended to write more about specific things like an apple, a book, or the car. In contrast, female writers wrote about connections to things like my apple, your book, or our car. The nouns themselves (apple, book, car) didn't matter much but the preceding qualifier, whether an article (a, an, the) or possessive (my, your, our), did.
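To make the contrast concrete, here is a minimal Python sketch that counts the two kinds of qualifiers in a snippet of text. It is not the Genie's actual code (that is written in PHP), and the word lists hold only the examples named above. The names `ARTICLES`, `POSSESSIVES`, and `count_qualifiers` are made up for this illustration.

```python
import re

# Assumption: only the example words from the paragraph above.
ARTICLES = {"a", "an", "the"}
POSSESSIVES = {"my", "your", "our"}


def count_qualifiers(text):
    """Count articles and possessives in text, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "articles": sum(word in ARTICLES for word in words),
        "possessives": sum(word in POSSESSIVES for word in words),
    }


print(count_qualifiers("An apple, a book, or the car."))
# {'articles': 3, 'possessives': 0}
print(count_qualifiers("My apple, your book, or our car."))
# {'articles': 0, 'possessives': 3}
```

The nouns are identical in both sentences, so only the qualifiers move the counts.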
Although I think you really can't figure out whether a writer is male or female based on writing, I still believe that the linguists' algorithm has useful applications. I have received emails from several authors saying that they have used it to help make their female characters come across as being more female and vice versa. Now, Customer Experience Crossroads sees it as another tool for tailoring marketing to a target market: "We don't all communicate in the same way. Worth considering when you think about customer experience."
The Gender Genie has been undergoing something of a revival recently as it is discovered and linked to by those who didn't know about it before last summer's extended break. If you're a new reader here, between May and September 2006 I pretty much let BookBlog rot as I recovered from exhaustion. Teaching fourth grade was already a tiring occupation, and moving to my current home exacerbated my situation due to a six-hour round trip commute. The Gender Genie's slapdash code broke down during all this, and I let it wallow until I built up enough strength to fix it.
Personal problems aside, it tickles me to find it mentioned on other sites since its popularity continues to amaze. Following are a few recent links of note, which I place here so I have something to muse on in my golden years.
BBC's Magazine Monitor: "At last! The answer to Paper Monitor's gender can be found at Gender Genie. Based on the last three entries, PM is male. Except on Monday, when we must assume someone filled in for him."
Nerve's Scanner: "We find the Gender Genie interesting, mostly because it’s so BAD. We submitted two recent articles for analysis, including the story we recently wrote about Sarah Silverman (in which we actually write about being a woman), and it told us we were a maaan, baby."
BlogHer: "Most of my blog entries are well under 500 words, so I used a tutorial about Firefox for my second test. I pasted in about half of it, 786 words, and the Genie again declared me male. This time the score was female 732, male 1131. I suddenly feel hair growing on my chest. Testosterone rushes to enlarge my biceps."
John Scalzi's Whatever: "Just to be sure it doesn't think I'm natively girly, however, I also fed it the first chapter of The Android's Dream, in which, as you know, someone farts someone else to death. The result: The algorithm believes the author of that passage is male."
The Scalzi post is the most notable. Being a popular science fiction author, he naturally has author friends who read and comment on his blog. If you scroll through the comments of the above linked post, you'll see Matt Ruff as an active participant in Scalzi's Gender Genie discussion. Ruff wrote Set This House in Order, the last book we attempted to discuss before BookBlog was silenced last summer. Ruff even says he fed some chapters of Set This House in Order into the Genie and,
I think there's another uber-discussion to be had, about how the toy's underlying algorithm actually works, and whether our responses to it are an example of the Eliza effect.
Wouldn't it be funny if this business about keywords was just a dodge, and it was actually determining the maleness or femaleness of texts by flipping a digital coin?
To respond to Ruff's wondering, the Genie doesn't flip a coin. It really does score keywords based on Koppel and Argamon's text-sexing algorithm. I've never claimed that the program was accurate, but rather think of it as proof of how far society has progressed in equalizing the sexes. Despite the researchers' claim, you can't tell if a writer is a man or a woman, but it is an interesting study in what kind of writing is perceived as male (i.e. concrete) or female (i.e. connective).
This Scalzi/Ruff business also sets off my irony meter. Here's Ruff, author of a BookBlog selection, bringing up an "uber-discussion" about our toy. Yet, we didn't manage to discuss his book because no one read it besides Daisy, the moderator. I have always felt bad about this and, several months ago, put Set This House in Order at the top of a pile near my desk as a reminder to send Daisy a kindly-worded e-mail about reviving the discussion. I'd promise to read it this time.
Hello and welcome to all the visitors from Digg, Gizmodo, Del.icio.us, Dilbert.Blog, and sundry places on the Internet. If you're wondering what's the deal with that Gender Genie thing, here's a link to everything we've written about it:
BookBlog's entries and comments about The Gender Genie
Feel free to leave comments of your own. We love lively discussion.
The Genie Lives...And It's About Time
The Gender Genie is mostly working. After spending what felt like an eternity looking at code followed by banging my head on the desk, it turns out the whole thing was brought down by only five lines of bad parsing. Some of the Genie's functionality is still wonky, mainly the form that handles its statistics, so I've taken those bits offline until I have the strength to look at it again.
If anyone manages to break it, please let me know. This way, I won't delay in finding a bridge and throwing myself over the side. [Oh, I'm kidding. The nearest bridge that's high enough has to be at least a mile away. It'd be much easier to just walk a few blocks to the main road and simply step in front of a bus.]
I've started getting e-mails from lots of people asking about The Gender Genie. Although BookBlog exists mainly as an online book club, the genie drives most of our traffic and sucks up a shocking amount of bandwidth each month. As a result, I think a post about its status and explanation of what happened is in order.
The Gender Genie evolved from a series of events. Andy, a lifetime member and proprietor of Reality Blurred, posted a link to a NY Times article about an algorithm designed to determine gender in writing. Rich, a former member, automated the algorithm in ASP for us to play with. We had a lot of fun, so I asked him if I could borrow his idea for BookBlog. Since ASP doesn't work here, I rewrote the program in PHP and turned it into a GUI by adding a bunch of pretty colors. (No surprise since I am a girl, after all.) Another blogger discovered it, linked to us, and it suddenly turned into a meme. The rest is history.
About two weeks ago, our hosting company migrated the site to a new server and all hell broke loose. The genie's PHP engine suddenly stopped working. A kind user alerted me to the problem, so I took the program down and began debugging. I'm sure the culprit is a single piece of badly written code but I haven't had enough free time to go through several thousand lines of it. Besides fixing the PHP problem, the page design and HTML also need an overhaul since I've done practically no maintenance for more than a year now.
So, when will The Gender Genie be back? I can't make a firm commitment, but I think it will be near the beginning of March. I have a week of vacation coming up and plan to spend it in front of the computer.
I received the following e-mail a while back:
While I was amused that your gender genie got my sex wrong 100% of the time, I feel it is unnecessary for the 'was I right' box to comment 'that is one butch chick.'
I am a happily married mother of two, with no 'butch' features other than a tendency to use the word 'the' more often than your algorithm thinks I should. I suggest you should consider whether it is your gender definitions which are wrong, and not me.
Handing out personal insults to people who are helping your research is neither funny nor clever. Plus, the phrase 'one butch chick' suggests that you have unresolved gender issues of your own, and therefore your theories may be tainted at source.
Since it is not BookBlog's intent to offend anyone, we would like to publicly apologize to you, sir, for any distress The Gender Genie may have caused. The Gender Genie is designed to entertain and amuse. To imply that such a masculine individual could possibly be a "chick" is degrading, demoralizing, and disgusting. We declare, proudly and enthusiastically, that your testicles are truly the size of cantaloupes. In addition, we share your disappointment in our conscious decision to choose the moment you inputted your text to personally insult you rather than serve up one of the many random canned insults we use to offend everyone else.
We sincerely regret upsetting men everywhere and should be flogged in a public square with a razor-tipped whip soaked in lemon juice as passersby throw rotting cabbages. However, we do take some comfort in knowing we were able to amuse you by being 100% wrong. You may rest assured, my man. We will probably be wrong many, many more times in the future, so prepare yourself for the raucous hilarity and please accept our humblest apologies.
[Yeah, yeah. Not only did our silly game upset this humorless happily married mother of two, but I posted her e-mail and ridiculed it. I am evil. Bite me.]
The Gender Genie Thinks Belle de Jour Is a Man, Baby
Those wacky folks at The Guardian mentioned The Gender Genie again:
Graham Thomson: The Gender Genie thinks your blog was written by a male. Was it?
Belle: It was not. Incidentally, the Gender Genie also thinks Orbyn was written by a male - her photo gallery would indicate not - and Wherever You Are is female - I am reliably informed otherwise.
I might have to add a new category to go along with "fiction," "non-fiction," and "blog" submissions: "London call girl bloggers who write tell-all exposés yet still manage to keep their anonymity, which, let's face it, is probably much more interesting than the actual content of the book."
(Is it me or does Belle de Jour seem wrong? I'm thinking Belle du Jour sounds better. Or maybe I'm just in the mood for soup.)
Just when we thought interest in our Gender Genie was about to die out, Alexander Chancellor of The Guardian wrote a column about it:
Given the Gender Genie's hopeless record in identifying the sex of the Guardian's women columnists, it is tempting to write it off as a piece of rubbish. But it's not quite possible to do that, for its guesses have proven accurate in 72% of cases, which may be less than the 80% claimed, but is quite impressive all the same.
The genie also did a tour of political blogs after getting mentions on InstaPundit, Andrew Sullivan, Dynamist, and National Review Online's The Corner. Of all of them, though, I think I most appreciate the endorsement from Allah Is in the House. Even a higher power enjoys the genie. [Thanks to BarCodeKing for the heads up.]
And, finally, welcome to all the NaNoWriMo participants who have been stopping by, plugging in their works in progress, and reporting their results. Please note, though, that although the Gender Genie will take a guess at the gender of your writing style, it cannot tell you whether or not your novel will be published. Perhaps I should blow the dust off the plans for the Your-Writing-Sucks Genie.
Thanks to a lot of input and new algorithms from Moshe Koppel and Shlomo Argamon, the researchers who inspired our successful program, an updated version of The Gender Genie was launched today. Its features include unique scoring based on genre of text entered and more detailed statistics. So far, it has a better accuracy rate than the previous version and does very well when texts of all genres consist of more than 500 words.
I hope everyone has as much fun playing with it as I have had building it.
BookBlog -- and The Gender Genie -- have been mentioned in the New York Times. The mention comes in a Circuits column's summary of The Gender Genie.
Pamela LiCalzi O'Connell writes that "some cheeky members of an online book discussion site, BookBlog, turned the algorithm into a Web application called Gender Genie...." She also interviews Mary and mentions the ongoing improvements to the Genie.
So, congrats to all involved, especially to Mary who created this place for us to hang out more than a year ago. And, welcome to the NYT readers who've come to see what we're all about.
Many thanks to Pam O'Connell who made the Gender Genie the lead item in her New York Times column, Online Diary. We're famous!
Since the article was written, our numbers have changed a bit. Thanks to mentions on several high traffic web sites (addictinggames.com, bluesnews.com, and metafilter.com), the genie has analyzed more than 390,000 documents submitted by over 200,000 unique users. Welcome to everyone who has stopped by to try it out.
It's also true as mentioned in the article that I have been working with Koppel and Argamon to improve the genie. The version which had been printed in The New York Times Magazine was a "toy" for readers like us to play with. The actual algorithm is quite complicated, but they have sent me a simplified yet web-friendly formula to help improve the genie's accuracy.
Stay tuned since the update is to be released very soon.
Well. When The Gender Genie was launched, no one here expected it to take off and make its way around the Internet. Thanks to everyone who linked to it, and I hope you're having fun taunting your friends with their results.
From reading various sites that link to it, I've noticed that many have taken issue with both its results and stats. It also seems like a lot of people are confusing gender with sex, so I thought I'd write up a post to explain the difference.
Sex, apart from the act of having it, refers to biological or physical traits that determine whether one is a man or a woman. We all know the difference between a penis and a vagina, right?
Gender refers to society's classification of characteristics perceived to be particular to a certain sex. For example, think about humans as hunter-gatherers. Hunting connotes a masculine activity, so your brain might conjure up images of burly men carrying huge rifles and wearing orange vests. But not all hunters are men and a woman who hunts is still biologically a woman. In imagining her, however, you might assign her some masculine traits like being butch or wearing iron-toed boots.
I realize the above gender example leans heavily toward stereotyping, but it gets the point across. Biology determines sex while society assigns gender. To relate this back to The Gender Genie, a woman author whose passage comes up with a male result is seen by Koppel and Argamon's algorithm as having a masculine quality to her writing because she's writing more about specific things (using keywords like "the," "a," "some," numbers, and "it") than connections (using keywords like "with," possessives, possessive pronouns, "for," and "not").
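Here is a rough Python sketch of that kind of keyword scoring, using only the keyword groups named above. It is an illustration and not the Genie's PHP code. Counting every hit as one point is an assumption, since the actual formula is not spelled out here, and `score_text` and `verdict` are hypothetical names.

```python
import re

# Keyword groups taken from the paragraph above. Equal weighting is an
# assumption made for this sketch; the real formula is not given here.
MASCULINE_WORDS = {"the", "a", "some", "it"}
FEMININE_WORDS = {"with", "for", "not"}
POSSESSIVE_PRONOUNS = {
    "my", "your", "his", "her", "its", "our", "their",
    "mine", "yours", "hers", "ours", "theirs",
}


def score_text(text):
    """Return (masculine, feminine) keyword counts for text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    masculine = feminine = 0
    for token in tokens:
        if token in MASCULINE_WORDS or token.isdigit():
            masculine += 1
        elif token in FEMININE_WORDS or token in POSSESSIVE_PRONOUNS:
            feminine += 1
        elif token.endswith("'s"):
            # Crude possessive test: also catches contractions like "it's".
            feminine += 1
    return masculine, feminine


def verdict(text):
    """Label text by whichever keyword group scores higher."""
    masculine, feminine = score_text(text)
    if masculine == feminine:
        return "tie"
    return "masculine" if masculine > feminine else "feminine"


print(verdict("The car hit a tree and it lost 2 wheels."))
# masculine
print(verdict("I went with my sister for her birthday, not Mary's."))
# feminine
```

Notice that neither sentence says anything about who wrote it. The label comes entirely from which group of small words shows up more often.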
The Gender Genie should really come up with results like "masculine" or "feminine" rather than "male" or "female." However, the former set of terms is highly subjective since gender can be assigned by either society as a whole or individual members of society. If a user puts in a passage by a man and gets "feminine" as a result, the user might think of that man as having feminine qualities and answer yes when asked if the result is correct.
The stats themselves are not to be taken at face value. Their near 50/50 results show us that determining sex from a writing sample is hit or miss. Determining gender from writing, though, is another matter entirely.
As for all you men who think The Gender Genie is bunk because of your consistent female results, I suggest you stop fighting it and go buy a dress already.
After Andy brought this New York Times Magazine article about gender and word choice to our attention, we (me) here at BookBlog headquarters (my bedroom) decided to test the algorithm by hand-scoring a few passages. We chose a few books off the top of three piles conveniently located right behind us, and conducted our own unscientific survey. We spent an hour typing and counting and adding and subtracting, and discovered that the algorithm correctly predicted the author's sex for our sample of 10 books 50% of the time.
Then, taking a cue from Rich and borrowing his idea for a textual gender predictor, we decided to create a little application of our own:
The Gender Genie
Despite Koppel and Argamon's claim that their algorithm is 80% accurate, our application only manages near 50% just as our hand-scoring did.
Why would we bother to announce a gender-predicting program that's right only half of the time? Well, we find it entertaining. Plus it amuses us when we put in passages written by a man and discover that he writes like a girl. And it's pretty.
can you tell if a writer is a man or a woman?
"Men and women ostensibly write the same language, on the other hand, but according to a recent article in The Boston Globe, they do so in ways that immediately reveal which sex is doing the writing." That's according to Sunday's New York Times Magazine, which reports on research done by scientists who "devised an algorithm that could predict with 80 percent accuracy the sex of the author."
They discovered that "women are apparently far more likely than men to use personal pronouns -- 'I,' 'you' and 'she' especially. Men, on the other hand, prefer so-called determiners -- 'a,' 'the,' 'that,' 'these' -- along with numbers and quantifiers like 'more' and 'some.'"
Is it truly possible to determine the sex of an author by a mathematical algorithm? If it's true, is this because men and women are so biologically different that even our prose is shaped by our genitalia? Or is this because we've been socialized so much that masculine and feminine roles affect even our writing? Especially in light of the discussion below, I'm curious to see what you all think.
And: Can you tell the difference? To see if you can, I've created a little test (read on).
The challenge: Guess the sex of the author of the following passages. The answers are at the end, in white. Give it a shot and then post your score. We're all friends, so we trust you not to lie. Ahem.
My methodology: I went to my bookshelves and pile of magazines, selected a handful of texts, mostly at random, and selected, again mostly at random, a few sentences from each one. I only chose a new passage if I thought that the first one I found was too revealing (e.g., you'd recognize the work and thus the author because of character names or details). I also tried to make sure it was representative, and not just a random wacky passage. That said, I realize my methodology is completely nonscientific, and the study above actually examined the whole text, not just a segment.
Just take the test. Are these writers male or female?
- The ape is too distant to be sedulous. All the great novelists like Thackeray and Dickens and Balzac have written a natural prose, swift but not slovenly, expressive but not precious, taking their own tint without ceasing to be common property.
- Misunderstandings tangle like phone cords; perverse emotions simmer beneath neutral banter. But IMing can be oddly hypnotic as well. As long as the chat box remains onscreen, a psychic connection continues even if neither participant says anything at all.
- Dorothy put her right hand on Cara's belly. She was carrying high, which tradition said meant the baby was a boy, but this had nothing to do with Dorothy's certainty of the child's sex. She just had a feeling.
- Black America and white America still live separately. Most whites live in overwhelmingly white neighborhoods; most blacks live in majority-black ones. Americans of different races still tend not to live together, socialize together, or chart their paths in this society together.
- Time to escape. I want my real life back with all of its funny smells, packets of loneliness, and long, clear car rides. I want my friends and my dopey job dispensing cocktails to leftovers. I miss heat and dryness and light.
- I just kept quiet and looked around. And I noticed things. The dots on the ceiling. Or how the blanket they gave me was rough.
- I had an inspiration once. I woke up one morning and I knew that today I had to swallow fifty aspirin. It was my task: my job for the day.
- I knew he was near, because in the candlelight I could see blood scattered in the dust around my bed and there was a red handprint on the sheets. I guessed he was in the shadows at the other end of the longhouse, waiting to loom out and surprise me.
- In the mystic offices to which such things were put, there was something that quickened his imagination. For these treasures, and everything that he collected in his lovely house, were to be to him means of forgetfulness, modes by which he could escape, for a season, from the fear that seemed to him at times to be almost too great to be borne.
- A fire walker with steel rods through his cheeks had predicted the year would end in disaster, the islands would be laid waste by a curse. Educated Fijians had laughed at his prediction, shrugging off the odd cyclone and shark attack.
The answers (highlight -- click and hold your mouse as you drag over the line -- to read)
- female: Virginia Woolf, A Room of One's Own.
- female: Emily Nussbaum, "Fast Company," Radar Magazine.
- male: Michael Chabon, "Son of the Wolfman."
- female: Farai Chideya, Don't Believe the Hype.
- male: Douglas Coupland, Generation X.
- male: Stephen Chbosky, The Perks of Being a Wallflower.
- female: Susanna Kaysen, Girl, Interrupted.
- male: Alex Garland, The Beach.
- male: Oscar Wilde, The Picture of Dorian Gray.
- female: Kiana Davenport, "Fork Used in Eating Reverend Baker."
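If you'd rather not count on your fingers, here is a small Python sketch for tallying a score against the answer key above. The key is copied from the list in passage order. The name `score_guesses` is made up for this example.

```python
# Answer key copied from the list above, in passage order.
ANSWER_KEY = [
    "female", "female", "male", "female", "male",
    "male", "female", "male", "male", "female",
]


def score_guesses(guesses):
    """Return how many guesses match the answer key."""
    if len(guesses) != len(ANSWER_KEY):
        raise ValueError("expected one guess per passage")
    return sum(g.strip().lower() == a for g, a in zip(guesses, ANSWER_KEY))


# Example: guessing "male" for every passage.
correct = score_guesses(["male"] * 10)
print(f"{correct} out of {len(ANSWER_KEY)}")
# 5 out of 10
```

Since the key is split evenly, guessing the same sex for every passage lands on exactly half.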
So, how'd you do?
After trying these, are you more or less convinced of the scientists' argument and findings? What did you find yourself looking for to determine whether the passage was written by a man or a woman? What parts misled you on the ones you got wrong? What parts were giveaways on the ones you got right?