Hey DW fiction fans!
Nov. 13th, 2012 05:27 pm
Do you like Doctor Who fan fiction? Do you like telling folks about the awesome stories you find? Do you frequent A Teaspoon and an Open Mind and make liberal use of their bookmarks and favorites buttons? Well, calufrax is looking for folks who would like to sign up for a week to spotlight four to seven of their favorite fics housed on the archive. Go here to read more and sign up! And if you're not familiar with Teaspoon, the calufrax Tag List is a great place to start finding gems.
calufrax recc-ers make an effort to highlight stories across all genres and eras from the show.
Don't believe me?
There are 1854 tags in calufrax with a label of "doctor:1" through "doctor:11". Also, on the Teaspoon main site there are 29403 stories labeled in the eras "First Doctor" through "Eleventh Doctor". Here is the data by era:
Columns are the count and percentage of tags in the archive (archive.tag, pct.archive), the count and percentage of tags in the recs (rec.tag, pct.rec), the proportion of stories in each era that were recced* (pct.era.rec), and two different expected numbers of tags for the recs in calufrax: one based on a simple random sampling of the archive (rec.exp), and one for an "ideal" recs comm where each era is equally represented and thus recs are uniformly distributed across eras (unif.exp).
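| era | archive.tag | pct.archive | rec.tag | pct.rec | pct.era.rec | rec.exp | unif.exp |
|---|---|---|---|---|---|---|---|
| 1 | 521 | 0.018 | 93 | 0.050 | 0.179 | 33 | 169 |
| 2 | 490 | 0.017 | 83 | 0.045 | 0.169 | 31 | 169 |
| 3 | 641 | 0.022 | 127 | 0.069 | 0.198 | 40 | 169 |
| 4 | 1027 | 0.035 | 140 | 0.076 | 0.136 | 65 | 169 |
| 5 | 1045 | 0.036 | 116 | 0.063 | 0.111 | 66 | 169 |
| 6 | 426 | 0.014 | 94 | 0.051 | 0.221 | 27 | 169 |
| 7 | 625 | 0.021 | 121 | 0.065 | 0.194 | 39 | 169 |
| 8 | 894 | 0.030 | 157 | 0.085 | 0.176 | 56 | 169 |
| 9 | 4568 | 0.155 | 224 | 0.121 | 0.049 | 288 | 169 |
| 10 | 16784 | 0.571 | 550 | 0.297 | 0.033 | 1058 | 169 |
| 11 | 2382 | 0.081 | 149 | 0.080 | 0.063 | 150 | 169 |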
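For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the derived columns can be recomputed from the two count columns. The variable and key names are made up for illustration, and it assumes rec.exp is simply each era's archive share scaled to the total number of rec tags, which is consistent with the rounded values in the table.

```python
# Tag counts for eras 1 through 11, copied from the table above.
archive_tags = [521, 490, 641, 1027, 1045, 426, 625, 894, 4568, 16784, 2382]
rec_tags = [93, 83, 127, 140, 116, 94, 121, 157, 224, 550, 149]

total_archive = sum(archive_tags)  # all era tags in the archive
total_recs = sum(rec_tags)         # all era tags in the recs
n_eras = len(rec_tags)

rows = []
for era, (a, r) in enumerate(zip(archive_tags, rec_tags), start=1):
    rows.append({
        "era": era,
        "pct_archive": a / total_archive,           # share of archive tags
        "pct_rec": r / total_recs,                  # share of rec tags
        "pct_era_rec": r / a,                       # fraction of the era's tags that were recced
        "rec_exp": total_recs * a / total_archive,  # expected recs under random sampling of the archive
        "unif_exp": total_recs / n_eras,            # expected recs if every era got an equal share
    })

for row in rows:
    print(
        f"{row['era']:>3} {row['pct_archive']:.3f} {row['pct_rec']:.3f} "
        f"{row['pct_era_rec']:.3f} {row['rec_exp']:.0f} {row['unif_exp']:.0f}"
    )
```

The two expected-count columns are the baselines the observed rec counts get compared against: one says "recs mirror the archive", the other says "recs treat every era the same".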
Now, a completely fair and stratified recs community might aim for equal representation among all eras in the recs. That's an ideal, but it is not quite fair when the archive itself is not fully balanced. Still, some trends are evident. Eras 9 and 10 make up 72.6% of the archive but only 41.8% of the recs. In the observed frequencies of recs tags, the earlier eras, while not achieving perfect balance, are certainly over-represented relative to their frequency in the archive. They borrow from the New Who eras of Nine and Ten, which are highly represented in the archive but at least more balanced in the recs. I also made a picture.

Chi-square tests of the observed frequencies against the null hypotheses of balance according to archive frequency and balance according to uniform frequency both resulted in p-values of effectively 0 (X^2 values of over 1000 on 10 degrees of freedom in both cases: a resounding rejection of the null). This corroborates what is obvious in the graph. Not so obvious is that even taking out the outliers of eras Nine and Ten, the resulting chi-square test of uniformity across the remaining eras still results in a rejection (X^2 = 45 on 8 df, p = 3x10^-7). I think that this may be because people are forgetting the awesomeness of Two and substituting in the more recent awesomeness of Eleven**, and because people remember that Eight is very pretty.
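A minimal sketch of these goodness-of-fit tests in plain Python, using the rec and archive counts from the table. The function names are hypothetical, and this is not necessarily how the numbers in the post were produced. The p-value helper relies on a closed form that only holds for an even number of degrees of freedom, which happens to cover both the 10 df and 8 df cases here.

```python
import math

# Tag counts for eras 1 through 11, copied from the table in the post.
archive_tags = [521, 490, 641, 1027, 1045, 426, 625, 894, 4568, 16784, 2382]
rec_tags = [93, 83, 127, 140, 116, 94, 121, 157, 224, 550, 149]


def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_even_df(stat, df):
    """Upper-tail p-value for a chi-square statistic (even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even degrees of freedom")
    half = stat / 2
    term = 1.0
    total = 1.0
    # Accumulate sum_{i=0}^{df/2 - 1} half^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


n_recs = sum(rec_tags)
n_eras = len(rec_tags)

# Null 1: recs follow the archive's era proportions.
archive_expected = [n_recs * a / sum(archive_tags) for a in archive_tags]
stat_archive = chi_square_stat(rec_tags, archive_expected)
p_archive = chi_square_p_even_df(stat_archive, df=n_eras - 1)

# Null 2: recs are spread uniformly over all eleven eras.
uniform_expected = [n_recs / n_eras] * n_eras
stat_uniform = chi_square_stat(rec_tags, uniform_expected)
p_uniform = chi_square_p_even_df(stat_uniform, df=n_eras - 1)

# Null 3: uniformity again, after dropping eras Nine and Ten.
remaining = [r for era, r in enumerate(rec_tags, start=1) if era not in (9, 10)]
remaining_expected = [sum(remaining) / len(remaining)] * len(remaining)
stat_remaining = chi_square_stat(remaining, remaining_expected)
p_remaining = chi_square_p_even_df(stat_remaining, df=len(remaining) - 1)

print(f"archive null:   X^2 = {stat_archive:.1f}, p = {p_archive:.3g}")
print(f"uniform null:   X^2 = {stat_uniform:.1f}, p = {p_uniform:.3g}")
print(f"without 9, 10:  X^2 = {stat_remaining:.1f}, p = {p_remaining:.3g}")
```

If SciPy is available, `scipy.stats.chisquare(f_obs, f_exp)` does the same job for any degrees of freedom; the hand-rolled version is used here only to keep the sketch dependency-free.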
It is also kind of cool how there is a little bump in the archive for 4 and 5 that is reflected somewhat in the recs pattern as well. I think, at least from an exploratory look at the data, that Three's contingent of recc-ers is quite good at getting him noticed. He also has the added bonus of covering UNIT. I can also conclude that Six's era seems to have a very large footprint, with the highest percentage of stories recc-ed of any era (22.1%). This means that Six writers are awesome.***
[*Astute readers will point out that stories may have multiple tags. I agree, but I believe that is true for both the archive and the recs site. There may be a "Multi-era" effect here, but there are only ~2900 tags in Multi-Era, which is only 10% of the data set, so let's assume for now that the distribution of eras within Multi-Era is comparable to the straight tags. (Actually, I would think it would generally up the count across all Doctors, since Multi-Era would be chosen when you have lots of different eras in a single story, so imagine that the red line may not be quite as severe as it is depicted on the chart.)]
[** I am likely guilty of this and must rectify it in later stints.]
[*** Full disclosure in the interests of Science, I write Six from time to time, thus I am also awesome]
Yes, okay then. Should you feel inclined, go sign up to rec on calufrax!
no subject
Date: 2012-11-13 11:32 pm (UTC)
no subject
Date: 2012-11-14 03:00 am (UTC)
I hope you get lots of sign-ups! I will sign up for a bit later since I just got done a few weeks ago.
no subject
Date: 2012-11-14 12:39 am (UTC)
no subject
Date: 2012-11-14 03:04 am (UTC)
*Inter-Ocular Accuracy. You may laugh but my advisor is a big proponent of this one.
no subject
Date: 2012-11-14 03:15 am (UTC)
no subject
Date: 2012-11-14 03:20 am (UTC)
no subject
Date: 2012-11-14 03:30 am (UTC)
no subject
Date: 2012-11-14 03:32 am (UTC)
Actually I'm all for exploratory data analysis. There is much that can be discerned through careful plots.
no subject
Date: 2012-11-14 03:41 am (UTC)
I am beginning to think I forgot most of what I learned in math stats. Like, I don't even really know what "exploratory data analysis" means. Although to be fair, I think chi-squares and all that would have been the next course in the prob/stats sequence, and I took the more generic intro stats like, two years ago and so it's kind of mindwiped. :/
no subject
Date: 2012-11-14 03:45 am (UTC)
no subject
Date: 2012-11-14 03:50 am (UTC)
no subject
Date: 2012-11-14 12:57 pm (UTC)
no subject
Date: 2012-11-14 01:49 am (UTC)
no subject
Date: 2012-11-14 03:08 am (UTC)
no subject
Date: 2012-11-14 03:56 am (UTC)
LOL. Well my recs are...hmmmm...sorry. Got distracted by your icon, lol!
GUH.
*SQUISHES YOU*
Have another Eight icon
no subject
Date: 2012-11-14 01:01 pm (UTC)
no subject
Date: 2012-11-14 03:46 pm (UTC)
*Grins*
no subject
Date: 2012-11-14 08:50 am (UTC)
also, i signed up to rec before i saw this post.
no subject
Date: 2012-11-14 12:55 pm (UTC)
It is basically saying that even the eras that make up a very small portion of the archive are not ignored by the recc-ers. They make up a larger proportion of recs than one might expect, given their archive proportion. So eg, Six is the smallest era in the archive with only 1.4% of tags in the archive, but he's represented in 5.1% of the recs. So that is a way to show that these earlier eras are spotlighted in the recs comm. (The chi-square tests in this case are overkill; the graph obviously shows that the blue dots do not follow either the red or the green pattern.)
Ten/Rose is great, sure, but if it made up as much of the recs as it does the archive, then the comm would be a lot less diverse.
no subject
Date: 2012-11-14 01:06 pm (UTC)
as a side note - i wonder whether six is recced more because people who write six tend to be good writers (i think this is so, actually - which is not a totally unbiased statement as a writer of six talking to a writer of six), people like/are encouraged to rec a variety of eras, people like/are encouraged to rec gen or a variety of pairings, and six fics are more likely to be gen than any one form of pairing, or that classic who people are more likely to sign up to calufrax (or to repeatedly sign up to calufrax).
no subject
Date: 2012-11-14 01:08 pm (UTC)
no subject
Date: 2012-11-14 01:37 pm (UTC)
10 9 | 11 5 4 | 8 3 7 | 1 2 6
and in the recs
10 9 | 8 11 4 | 3 7 5 | 6 1 2
8 and 5 have the biggest jumps in rank; the only ones that leave their bins for the next higher up or lower down. I could do some bootstrapping tests to associate some margin of error with those ranks.... or I could work on the paper that I have to present friday. ;)
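The two rank orders above can be reproduced from the tag counts in the post's table. A minimal sketch, with made-up helper names and assuming the bins are the 2, 3, 3, 3 grouping shown by the bars:

```python
# Tag counts for eras 1 through 11, copied from the table in the post.
archive_tags = [521, 490, 641, 1027, 1045, 426, 625, 894, 4568, 16784, 2382]
rec_tags = [93, 83, 127, 140, 116, 94, 121, 157, 224, 550, 149]


def rank_order(counts):
    """Era numbers sorted from most tags to fewest."""
    return sorted(range(1, len(counts) + 1), key=lambda era: counts[era - 1], reverse=True)


def bin_of(order, era, edges=(2, 5, 8, 11)):
    """Index of the rank bin (sizes 2, 3, 3, 3) that an era lands in."""
    position = order.index(era)  # 0-based rank
    return next(i for i, edge in enumerate(edges) if position < edge)


archive_order = rank_order(archive_tags)
rec_order = rank_order(rec_tags)

# Eras whose bin differs between the archive ranking and the recs ranking.
moved = [
    era
    for era in range(1, len(rec_tags) + 1)
    if bin_of(archive_order, era) != bin_of(rec_order, era)
]

print("archive:", archive_order)
print("recs:   ", rec_order)
print("changed bins:", moved)
```

A bootstrap margin of error would resample the tags and repeat this ranking many times; that part is left out here.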
no subject
Date: 2012-11-14 01:48 pm (UTC)
but looking at my own recs...
eight: 5
six: 3
another era: 3
three: 2 (and a brig/liz)
ten: 2
eleven: 2
four: 2 (but one isn't tagged!)
five: 1
two: 1
ten: 1
seven: 1
one->ten: 1
and five's one of my favourites. and the only fic i recced of his was one about his celery and not really about him. well, am i reccing in two weeks' time or aren't i?
no subject
Date: 2012-11-14 02:18 pm (UTC)
Oh no! This invalidates my results!
Okay, maybe not.
For the older doctors I've definitely given more recs as parts of multi-era fic. I think the thing with Five is that a lot of the fic for him in the archive is pairings: Five/Tegan, Five/Turlough or Five/Master. Because I would guess that along with Eight, he would be one of the more popular doctors being written before the new series aired, so in a sense he took on a more traditional "fanfic role"? I can imagine for example, that pre-new-series, Five/Tegan would be the old Ten/Rose.
My distribution looks like:
Ten: 20
Eleven: 15
Eight: 7
Nine: 6
Five: 6
Seven: 5
Four: 5
Six: 4
One: 4
Three: 4
Two: 3
Other Era: 5
There are also stories that are not tagged with the Doctor-as-character (which is what those "doctor: x" tags are), like
Ah the perils of examining your data! More research is needed! ;D
Edited to provide rank order
no subject
Date: 2012-11-14 02:28 pm (UTC)
i agree with your conclusions about five, though! and i guess, apart from fitz, there's not really anyone obvious to pair eight with, and there has never been much EDA fic at all. but then why does he have so much fic that is worth reccing???
if you want to follow this through, you could also say whether time written affected the kind of fic written - but it would be very difficult, as all of mine say they were written in 2011-ish, and they weren't...
no subject
Date: 2012-11-14 01:21 pm (UTC)
See, this is why graphs are cool, because they help us develop questions and hypotheses that theoretically we could design measurements or experiments or surveys to help answer. I rarely get past the "look at the pretty picture" stage though. :D
no subject
Date: 2012-11-14 01:37 pm (UTC)
i do think audios are better for character voice than tv episodes, definitely. fewer distractions.
books is a bit trickier, as i felt it was easier to get a voice right (or just imagined i did) while i was in books-only fandoms, like harry potter and tamora pierce than i now do in doctor who, my only tv fandom.
but it doesn't follow that reading the doctor who books will work the same way as books-only... books.
the fact that doctor who is primarily a tv fandom means that the books will always be a slightly paler imitation of an actual person speaking (audios, obviously, are not).
so they're never an entirely accurate representation of how that character, as played by (for example) patrick troughton would talk. except targets, i suppose. but even then if you don't remember exactly how patrick troughton said a thing then it might be more difficult to accurately replicate how he'd say something else similarly.
as for the original stories books - talking strictly about the new series here, as it gets complicated with seven and particularly eight - i haven't found the writers' takes on the doctor's voice to be very consistent, even when they're an otherwise good writer. like - listening to david tennant read the... one about the pirates (resurrection casket?) made me think - you haven't done a very good job of this, justin richards, because it doesn't sound like the doctor even when the doctor is actually reading this aloud.
thoughts. i have them.
no subject
Date: 2012-11-14 09:35 am (UTC)
Also that
Poor Two!
no subject
Date: 2012-11-14 01:07 pm (UTC)
And yes, poor Two! We should have a viewing party or something with all the cool Two stories and remind everyone of the awesomeness of Two & Jamie, Zoe, Victoria and the rest. Encourage more folks to read and write Two. :D
I wonder if we could do something like a "121 Ficathon: Eleven prompts, eleven eras" And have like a big table with rows as Doctors and columns as prompts or tropes or what-have-you; then people could sign up for each square they wanted to fill. It would then result in a giant matrix of awesome.
no subject
Date: 2012-11-14 01:33 pm (UTC)
The ficathon idea sounds very cool.
As does making everyone write more Second Doctor fic! I haven't written enough myself. There are some good writers out there for him, but he does get neglected - I was looking for one for him a couple of days ago for this week (think I've found one now) and noticed how many are actually multi-era fics, or the same authors. (Other classic Doctors have similar problems too.) And Ben and Polly. Everyone should write more Ben and Polly. :-)