Downverse workflow for research - Miao Yu

Firstly, I really appreciate Yihui’s post about me.

Today I want to share some tips with people in academia on how to reduce repeated work in research. You might treat this as an extension of Jeff Leek’s excellent book: How to be a modern scientist.

What?

Which kinds of work count as repeated work in research? A lot! However, unless your experiments are all in silico, I just want to focus on the parts after you have finished your experiment. Repeated work in data analysis, writing and presentation should be considered.

Why?

Doing the same thing multiple times wastes your time. It also introduces variation that makes your results harder to reproduce. If we cannot find a robot, the better choice is to use code. An important principle in research is that whatever you do should be fully reproducible by other people, and by yourself. When you do something related to research, make sure other people can always repeat your operations from the recipe you left. You should always treat your future self as ‘other people’; such code or text files will save you effort later.

How?

Use a Graphical User Interface (GUI)?

Nooooooooo! No one remembers whether certain buttons were clicked or not. However, you can use a macro to track your operations if possible. Make sure such a macro can be shared without requiring expensive software. If you have to use a GUI, try to avoid interactive operations that could affect your results.

Version control?

Yes! Always use version control for your files and add comments about changes. Use words that human beings can understand. Otherwise, prepare a code book for yourself. Trust me, you need such a code book more than anyone else does. Git and GitHub are a good start.

RSS?

Yes! Try to create your own RSS list and keep track of new papers in a passive way. Always start with the abstract and end with a note that links to the original paper. Organize the notes as a book related to your research topic. Only when a new paper appears as a note in your knowledge system or book can you really use it. Avoid being guided by the authors, and keep your own system updated as I did for metabolomics here.

Plot/Process similar data with the same setting?

Try to organize scripts into one function and use rmarkdown to make reproducible reports about it. Next time you only need to call that function, as I did here.
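To make the idea concrete, here is a minimal sketch in base R. The function name `summarise_runs`, its parameters and the two small data vectors are all made up for illustration; they are not the code behind the link above. The point is only that every setting lives in one place, so each new data set is processed the same way.

```r
# Hypothetical helper: all processing settings live in the arguments,
# so every data set is treated with exactly the same rules.
summarise_runs <- function(x, trim = 0.1, digits = 2) {
  stopifnot(is.numeric(x))
  c(
    n    = length(x),
    mean = round(mean(x, trim = trim), digits),  # trimmed mean
    sd   = round(sd(x), digits)
  )
}

# Made-up example data standing in for two batches of measurements
batch1 <- c(1.2, 1.4, 1.3, 9.9, 1.1)
batch2 <- c(2.1, 2.3, 2.2, 2.0, 2.4)

# Next time, you only call the function again on the new data
results <- lapply(list(batch1 = batch1, batch2 = batch2), summarise_runs)
print(results)
```

The same call could sit inside a code chunk of an R Markdown file, so the report is rebuilt from the raw data whenever it is rendered.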

Make it clear?

Use tidyverse style or verse with tinytex support. Try to keep your style stable, and other people with a similar style will figure out what you mean from your code in one second.

Market your ideas?

Write journal papers with rticles and present them at conferences with xaringan and/or shiny. All you need to learn is markdown, and you could graduate in five minutes at most.

Try to share the source files for your ideas as a GitHub repo; this helps your ideas spread like memes. You should also consider preprint servers such as arXiv, bioRxiv and ChemRxiv to share your ideas in physics, life science and chemistry, respectively.

SNS your ideas?

Write blog posts with blogdown in plain text and share them on Twitter/LinkedIn/ResearchGate. Actually, one of the purposes of my website is to share my ideas.

A bunch of functions on a similar topic?

Pack the functions into a package, write detailed documentation and publish it with pkgdown, as I did for the enviGCMS package.

Several ideas on a similar topic?

Write a book online with bookdown, as I did for metabolomics here.

Reproduce the whole environment for data analysis?

Use a Docker or rocker image to pack all the dependencies and software you need into a Linux image and share it on Docker Hub. You can also distribute your raw data and scripts with your Docker image and put the links in your publications. In this case, anyone can validate your results with the same settings. I also made one for metabolomics studies.

Share the data?

Try figshare or the Open Science Framework to share your data or projects. I still suggest using a Docker image to avoid potential issues.

Literature management?

Zotero or fulltext. However, as I showed here, all you need for literature management is the DOI. Just build your own knowledge system and organize the literature according to your topics. Such a system saves you a lot of time when you align the literature with the sections of your papers, since you have already done this at the very beginning.
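As a rough illustration of a DOI-only system, the base R sketch below keeps one row per paper and groups the links by topic. It is not the workflow from the link above: the table, the topic names and the DOIs are made-up placeholders, and `doi_link` is a hypothetical helper. The only fact it relies on is that a DOI can be turned into a link by prefixing it with https://doi.org/.

```r
# Hypothetical notes table: each paper is identified only by its DOI.
# The DOIs below are made-up placeholders, not real papers.
notes <- data.frame(
  doi   = c("10.5555/example.001", "10.5555/example.002", "10.5555/example.003"),
  topic = c("sample preparation", "peak detection", "sample preparation"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn a DOI into a resolvable link
doi_link <- function(doi) paste0("https://doi.org/", doi)

# Group the links by topic, mirroring the sections of a future paper
by_topic <- split(doi_link(notes$doi), notes$topic)
print(by_topic)
```

Because the topics are assigned when the note is written, the grouping for each section of a paper is already there when writing starts.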

I can’t remember those tips

Well, I prepared a downverse rocker image with all that software installed, and you can try it. Other websites where you can always find excellent tools for research are rOpenSci and RStudio.

Thanks again, Yihui! Actually you developed most of the packages.