Knowledge clip: Keeping research data organized

Keeping files organized during your research is a key aspect of data management. In this knowledge clip we have a look at the different aspects of file organization (file naming, folder structure and version control), and provide tips and best practices.

0:00 Keeping data organized: Why file organization?
0:22 What does file organization involve?
0:39 Develop a file naming convention
1:49 File naming: best practices
2:27 Folder structure
2:56 Folder structure: best practices
3:33 Version control
3:53 Version control strategies
4:54 File organization: final tips
Comments

I use random password generators for file naming. Every time I open a file, it's like a tiny surprise party or similar to the rush I get when hitting a good roll at the casino if the file is actually the one I'm looking for. I like to live under constant stimulation and stress, it helps me feel alive.

G_Whiz

One often overlooked aspect is a deprecation strategy. At some point documents may no longer be relevant, and one wants them out of sight of the normal document archive, while still being able to find them with relative ease.

jenswurm

This is excellent!

If you don't have revision control software, always have a folder for your source material, and copy that material out to work on it.

So many people accidentally edit the source image.

lazygardens
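
The "copy it out before you work on it" habit described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the folder names `source` and `work`, the file name `scan_001.tif`, and the helper `make_working_copy` are all hypothetical, and the demo runs in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def make_working_copy(source_file: Path, work_dir: Path) -> Path:
    """Copy source_file into work_dir and return the path of the copy."""
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / source_file.name
    if target.exists():
        # Refuse to overwrite an existing working copy.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(source_file, target)  # copy2 also tries to keep timestamps
    return target


# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "source" / "scan_001.tif"  # hypothetical file name
    original.parent.mkdir()
    original.write_bytes(b"raw image bytes")

    copy = make_working_copy(original, root / "work")
    copy.write_bytes(b"edited image bytes")  # edit only the copy

    source_untouched = original.read_bytes() == b"raw image bytes"
    print(source_untouched)
```

Refusing to overwrite an existing working copy is a design choice here, so that a second run cannot silently discard edits.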

The best way to organize files is to study taxonomy and apply those principles to the structure.

lktstuff

Ghent University Data Stewards - Excellent video. Maybe the best I’ve seen on folder / file organization.

However, I disagree with you on some aspects of this.

1 - Folder Names and Structure - Don’t design your folder names around file attributes. That is what tags are for. Folder names should be “categories” of topics/objects/nouns, and the structure should be hierarchical. I do agree that the “categories” should not overlap. Some professional consulting firms have developed a methodology for organizing information called MECE, which stands for mutually exclusive (ME: no overlap) and collectively exhaustive (CE: nothing overlooked). If it appears that a file can logically fit into more than one folder, that’s a signal that either the hierarchical folder structure needs to be changed or tags should be used.

2 - File Names - Don’t put dates in file names. If you do, files with common topics probably can’t be sorted and listed together. Windows and Mac operating systems create and maintain a “Create Date” and a “Last Modified Date” for each file. If some other date (like a “Due Date”) is important, add it as a tag (attribute) of the file. Unquestionably, the best and most understandable file naming scheme is to make the names specific cases of a topic/object/noun that sit deeper in the hierarchy than the folder names.

jimgrant

Don't store all the information in the filename! Use metadata instead. This makes sorting, filtering and grouping much easier. Think of it as dynamic folders.

pieterkops
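
One way to picture the "dynamic folders" idea is a small catalog of metadata records that can be filtered and grouped in any direction. The sketch below is an assumption-laden illustration, not a real tagging system: the field names (`kind`, `project`, `year`), the file names, and the helpers `select` and `group_by` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical catalog: one metadata record per file.
catalog = [
    {"file": "interview_anna.wav", "kind": "audio", "project": "pilot", "year": 2022},
    {"file": "interview_ben.wav", "kind": "audio", "project": "main", "year": 2023},
    {"file": "survey_results.tsv", "kind": "table", "project": "main", "year": 2023},
    {"file": "field_notes.txt", "kind": "text", "project": "pilot", "year": 2022},
]


def select(records, **criteria):
    """Return the names of files whose metadata matches every criterion."""
    return [
        r["file"]
        for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


def group_by(records, key):
    """Group file names by one metadata field, like a folder view built on demand."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r["file"])
    return dict(groups)


audio_files = select(catalog, kind="audio")
main_2023 = select(catalog, project="main", year=2023)
by_project = group_by(catalog, "project")
print(audio_files)
print(by_project)
```

The same file shows up under whichever view is requested, which is what a fixed folder tree cannot do without duplicating it.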

In any folder that will hold several revisions, I always create a subfolder called "_Superados" ("superseded"). I move all the obsolete versions there so that only the latest one remains, but I don't get rid of the earlier ones, as a precaution. That way, a year or more later, when I go looking for the latest report, I don't have to deal with a list of versions.

edwingonzalez
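
The "_Superados" routine above could be automated roughly as follows. This is a minimal sketch under stated assumptions: the helper `archive_old_versions` and the `report_vNN.txt` names are hypothetical, and "latest" is taken to mean last in sorted name order, which only works if version numbers are zero-padded.

```python
import tempfile
from pathlib import Path
from typing import Optional


def archive_old_versions(
    folder: Path, pattern: str, archive_name: str = "_Superados"
) -> Optional[Path]:
    """Move all but the latest file matching pattern into folder/archive_name.

    Assumption: sorting by name puts the latest version last.
    Returns the path of the version left in place, or None if nothing matched.
    """
    versions = sorted(p for p in folder.glob(pattern) if p.is_file())
    if not versions:
        return None
    archive = folder / archive_name
    archive.mkdir(exist_ok=True)
    for old in versions[:-1]:
        destination = archive / old.name
        if destination.exists():
            raise FileExistsError(f"{destination} already exists")
        old.rename(destination)
    return versions[-1]


# Demo in a temporary directory with three hypothetical report versions.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for n in (1, 2, 3):
        (folder / f"report_v{n:02d}.txt").write_text(f"draft {n}")

    latest = archive_old_versions(folder, "report_v*.txt")
    latest_name = latest.name
    remaining = sorted(p.name for p in folder.iterdir() if p.is_file())
    archived = sorted(p.name for p in (folder / "_Superados").iterdir())
    print(remaining, archived)
```

Nothing is deleted: older versions are only moved, matching the precaution described in the comment.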

Most filesystems also provide ways to add tags to files, which can then be used when searching for a specific file or a group of files.

CrazyDriverSwed

What software do people typically use for version control of research data?

EverCraft_File_History

I make a point of never using "final" in the file name: best way to jinx the project…😆

vaughngaminghd

Any tips for when I have a lot of audio or video files? For example, I have about 30 audio recordings of the same event. Any ideas?

danielcraft

I would also add a strong recommendation to avoid binary formats if possible. It is almost trivial to track or spot differences between, say, two TSV files without opening them (git and diff).

darked
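
To illustrate why plain-text formats are easy to compare, here is a small sketch using Python's standard `difflib` module in place of the `git` and `diff` commands mentioned above. The two TSV snippets and their file names are invented for the example.

```python
import difflib

# Two hypothetical versions of the same small TSV table.
old_tsv = "sample\tmass_g\nA1\t12.4\nA2\t13.0\n"
new_tsv = "sample\tmass_g\nA1\t12.4\nA2\t13.1\n"

diff = list(
    difflib.unified_diff(
        old_tsv.splitlines(),
        new_tsv.splitlines(),
        fromfile="measurements_v1.tsv",
        tofile="measurements_v2.tsv",
        lineterm="",
    )
)

# Keep only the added and removed rows, skipping the two file header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
```

The changed rows come out as readable text, which is not possible for a binary spreadsheet file without opening it in the program that wrote it.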

JUST AMAZING! CAN'T BELIEVE THAT YOU ONLY HAVE 2065 VIEWS.

sebastianpozo

You need git or a related tool to track the state of your project files.

mflowsi

The underscore disables Windows search capability for file names.

quochuynh

These were "best practices" in 1989. The world needs to move from data categorization that is based on "where" to one that is based on "what". This means that categorization is only incidentally related to folder trees; it should be done primarily via metadata, or via emergent metadata extracted by modern search engines. The old way of doing things is based on the concept of taxonomy; information is best organized not in a hierarchy but in a network.

phpn

If using the date as the file name, I would start with the year, then the month, then the day, e.g. 2023 07 04.

ElCidPhysics
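
The reason for year-month-day order is that a plain alphabetical sort then matches chronological order. A short sketch, assuming hyphens as separators (the ISO 8601 form, rather than the spaces in the example above) and a hypothetical helper `dated_name`:

```python
from datetime import date


def dated_name(day: date, description: str, extension: str) -> str:
    """Build a file name that starts with the date as YYYY-MM-DD."""
    return f"{day.isoformat()}_{description}.{extension}"


# Hypothetical files, deliberately listed out of order.
names = [
    dated_name(date(2023, 7, 4), "interview", "txt"),
    dated_name(date(2022, 12, 31), "interview", "txt"),
    dated_name(date(2023, 1, 15), "interview", "txt"),
]

# An ordinary string sort is also a chronological sort.
chronological = sorted(names)
print(chronological)
```

With day-first or month-first names, the same sort would interleave different years.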

We would have to see what the paper says, but I think the AI has memorized the film, and what it predicts is the minute the mouse is watching, based on a reading of its brain waves.

valerio

Don't use acronyms!
You will forget them!
Your teammates will ask what they mean!
This is not the old PC era, when every bit counted and screens were small. Just write a long name; it is OK!

As for versions, use the date and the time of day in 24-hour format,

like 2023/12/24-1830.

By the way, pro tip: save a version after every hour of editing!

This system will also keep you going. Once you see the files and notice gaps in the dates, your brain will say, "Start working; look, there are dates when we didn't work!"

atlasstone
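
A timestamped snapshot along these lines might look like the sketch below. One assumption to flag: slashes cannot appear in file names on common operating systems, so the stamp is written as `2023-12-24-1830` instead of `2023/12/24-1830`. The helper `save_snapshot` and the file name `thesis_draft.txt` are hypothetical.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def save_snapshot(current: Path, when: datetime) -> Path:
    """Copy current to a sibling file whose name carries a date and 24-hour time."""
    stamp = when.strftime("%Y-%m-%d-%H%M")
    snapshot = current.with_name(f"{current.stem}_{stamp}{current.suffix}")
    shutil.copy2(current, snapshot)
    return snapshot


# Demo in a temporary directory with a fixed time so the result is predictable.
with tempfile.TemporaryDirectory() as tmp:
    draft = Path(tmp) / "thesis_draft.txt"  # hypothetical working file
    draft.write_text("chapter one")

    snap = save_snapshot(draft, datetime(2023, 12, 24, 18, 30))
    snapshot_name = snap.name
    same_content = snap.read_text() == "chapter one"
    print(snapshot_name)
```

In real use the second argument would be `datetime.now()`, called once per saved version.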

It's a bit obnoxious to call your own preferred folder organization style "Best Practices".

dannytan