(Pictures by Dave McKean from The Wolves in the Walls by Neil Gaiman)

Thursday, June 29, 2006

enormous datasets suck*

I finally went through and checked that all my raw data was in the right place so that I can start to analyse it. This, and not the analysis itself, is actually what I've been avoiding doing for three months. I was kind of expecting that it would only take me a few minutes and then I'd feel silly for putting it off for so long. It turned out to be almost as painful as I thought, though. I feel somewhat satisfied, since I wasn't dreading it for nothing, but it's not going to make it any easier to avoid procrastinating on the next batch...

*I originally had the much pithier title, "data sucks", but after sitting in a class where a student (still known to my friend and me as "data girl") spent half an hour trying to convince the head of a department that data was singular, I cringe every time I see/hear it used that way (especially when I'm the one misusing it). Data suck seems wrong, though, because it's not that each individual datum sucks, separately they're really very cool; it's only when a few hundred of them get together that they start sucking. I think data implies that I'm talking about a big, enormous dataset, anyway, because nobody talks about data if they only have 5 of them, do they? So, basically I'm whining that I shouldn't have to spell it out. (I think I'm extending Wednesday for the whole week)

11 Comments:

At 3:13 PM, Blogger Dr. Brazen Hussy said...

Excellent solution to the ever present data as plural problem!

 
At 4:32 PM, Blogger luckybuzz said...

I know pretty much nothing about this type of data or analysis or whatever, but this:

"separately they're really very cool; it's only when a few hundred of them get together that they start sucking."

--is how I feel about people in general. :)

 
At 6:27 PM, Blogger BrightStar (B*) said...

yay for tackling something you've been avoiding! YAY! Inspiring... maybe I'll tackle something I've been avoiding, too.

 
At 7:03 PM, Blogger Lucy said...

oh, and I may have got around the data problem this time, but now I'm embarrassed by my terrible semi-colon usage.

 
At 7:05 PM, Blogger Lucy said...

Hey, where'd my other comment go? It'll probably show up after I retype it...

Luckybuzz, me too :)

B*, I'm not sure it's inspiring when the only reason I'm doing it is because I'm supposed to be talking about the wonderful results that are (hopefully) contained within the data.

 
At 8:17 PM, Blogger post-doc said...

I feel like I spend some weeks just moving data, changing formats, making sure it's in the right spot at the right time. It takes time and energy and motivation, so you should feel proud that you made some progress. Yay for you!

 
At 10:50 AM, Anonymous Anonymous said...

I hope your talk goes well - it's today, right? Analyzing data totally throws me into a panic.

 
At 11:48 AM, Blogger betty said...

lucy - you're my giant-data-set analysis hero! the spreadsheet you sent me for 96-384 conversion has been so useful to me that every time i open it i want to email you and tell you how wonderful you are.

hope you got through unscathed and that the presentation goes well.

 
At 6:37 PM, Anonymous Anonymous said...

I hope you survived the talk!! How'd it go?

 
At 11:03 PM, Blogger Suz said...

Actually, lots of people talk about data when they only have 5 of them, or sometimes even when they only have 1 of it, and even of the "1 if it," is nothing but a technical artifact. That is what our group's weekly lab meetings are all about - pointless pontification and overanalyzing limited data. Yay molecular biology!

 
At 11:04 PM, Blogger Suz said...

oops.. made a tpyo: above, it should read "even if the '1 of it' is nothing..."

 
