The reason I had been looking for that Sydney Harris cartoon is that I was putting together a guest lecture for our university's "Responsible Conduct of Research" course. I was speaking today about data management and retention, a topic I've come to know well over the last year through university service work on policies in that area. After speaking, it occurred to me that it would be a good idea to summarize the important points for the benefit of student readers of this blog. In brief:
- Everything is data. Not just raw numbers or images, but also the final analyzed graphs, the software used to do the analysis, the descriptions of the instrument settings used to acquire the raw numbers - everything.
- The data are the science. The data are the foundation for all the analysis, model-building, papers, arguments, further refinements, patents, etc. Protect the data!
- If you didn't document it, you didn't do it.
- Write down everything. Fill up notebooks. Annotate liberally, including false starts and what you were thinking when you set up the little sub-experiments or trials that go into any major research endeavor. I guarantee you will never, ever in your life look back and say, "I regret that I was so thorough, and I wish I had written down less." After years of observation, I am convinced that good notebook skills genuinely reduce mean time to thesis completion in many cases. If you actually keep track of what you've been doing, and really write down your logic, you are less likely to go down blind alleys or repeat mistakes.
- You may think that you own your data. You don't, technically. In an academic setting, the university has legal title to the data (that gives them the legal authority that they need to adjudicate disputes about access to data, including those that arise in the rare but unfortunate cases of research misconduct), while investigators are shepherds or custodians of the data. Both have their own responsibilities and rights. Some of those responsibilities are inherent in good science and engineering (e.g., the duty to do your best to make sure that the published results are accurate and correct, as much as possible), and others are imposed externally (e.g., federal funding agencies require preservation of data for some number of years beyond the end of an award).
- Back everything up. In multiple ways. With the advent of scanners, digital cameras, cheap external hard drives, laptops, thumb drives, "the cloud" (as long as it's better than this), etc., there is absolutely no excuse for not properly backing up data. To repeat, back everything up. No, seriously. Have a backup copy at an off-site location, as a sensible precaution against disaster (fire, hurricane, earthquake, zombie apocalypse).
- Good habits are habits, and must be habituated. It took me more than 25 years to get in the habit of really flossing. Do yourself a favor, and get in the habit of properly caring for your data. Please.
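
For the students who like something concrete, here is a minimal sketch of what "back everything up" might look like as a script you run at the end of each day. It is only an illustration, not a recommendation of any particular tool: the function names `sha256_of` and `backup_tree` are my own inventions for this example, and it assumes your data lives in one folder and your backup drive is mounted as another. It copies new or changed files and checks each copy against a checksum of the original, so you find out right away if a copy went bad.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source: Path, destination: Path) -> list[Path]:
    """Copy new or changed files from source to destination.

    Hypothetical helper for illustration. Returns the list of backup
    files that were written. Files whose backup copy already has the
    same checksum are skipped. Nothing is ever deleted from destination.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        if dst.exists() and sha256_of(dst) == sha256_of(src):
            continue  # backup is already current
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        if sha256_of(dst) != sha256_of(src):
            raise IOError(f"checksum mismatch after copying {src}")
        copied.append(dst)
    return copied
```

You would call it with something like `backup_tree(Path("lab_data"), Path("/mnt/backup/lab_data"))`, where both paths are placeholders for your own setup. Note that a script like this gives you only one extra copy. You would still want a second copy at an off-site location, and you would still have to make running it a habit.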
