Part 1: Import Procedure for the Huguenot Database

Steps	Objective	People	Status
0. Further digitize Huguenot data	The remaining information of the Huguenot Database is digitized and ready to be imported.	Mathilde, Oriane, Vincent, Morgane	Ongoing
1. Establish internal work database	We have internal infrastructure hosting a database that we can use to inspect and analyze the data.	Gaétan & Francesco	Done
2. Import & transform Huguenot data into work database	Data is imported from several tables into one table, making it easy to inspect and analyze.	Gaétan & Francesco	Done
3. Inspect the data & describe “potential” & “challenges”	Inspect the data so that we understand how the different main classes are expressed (places, people, value of donations, professions, …), answering the following questions: What are the different types of these classes? Are there different ways of writing them? Are they …	Gaétan, Oriane, Supervision: Francesco	Open
4. Exchange with PI/researchers on data potential & challenges & decide priorities	Exchange between the data team and the principal investigator/researchers so that we understand: What kind of data do we have, and therefore what can we use for research? What are the detailed research questions? From this, it is decided which information is (a) cleaned and (b) aggregated, and how. For example, we might have interesting data on: quantitative values of donations by institution over time; migration of people and families (grouping their appearances); information on origin and professions; information on family structures. Based on this feedback, different data aggregation approaches might be chosen (each focusing on a specific aspect of the data).	Mathilde, Gaétan, Oriane, Francesco
5. Build initial data model	Based on this exchange, reflect on how the main classes identified can be modelled in the current data model.	Francesco with support of Mathilde, Oriane, Gaétan
6. Test data aggregation workflows based on priorities	In the work database, the Huguenot data is grouped using clustering algorithms (publishing the data on the fly on a SPARQL endpoint). Clusters could include: family structures, life journeys.	Gaétan, Oriane, Supervision: Francesco
7. Inspect the aggregation clusters	Based on the draft clusters, first analysis results become visible and allow the research team to understand the analysis potential. This step is important for deciding on the data-matching and wrangling algorithms to be used for the data import.	Mathilde, Gaétan, Oriane, Francesco
8. Adjust & improve aggregation clusters	Adjust the aggregation algorithms according to the discussions.	Gaétan, Oriane, Francesco
9. Build controlled vocabularies & lists of main classes & relate them with work database	In this step, “clean” lists of specific classes (in the sense of controlled vocabularies) are built in order to prepare for the import. These vocabularies and their identifiers are then associated with the values in the database. This can, for example, include: a list of all places, a list of all professions, a list of all institutions, a list of currencies, etc.	Oriane & Mathilde
10. Build detailed semantic data model	In this step, the data model is fine-tuned based on the research priorities decided above.	Francesco with support of Mathilde, Oriane, Gaétan
11. Final cleaning of all data selected for import	Data in the internal database is then cleaned	Mathilde, Gaétan, Oriane, Francesco
12. Data mapping, matching & import	Data is then mapped onto the data model, merged according to the selected and tested merging/aggregation algorithms, and imported into the Geovistory environment (as facts or factoids, depending on the strategy chosen).	Gaétan, Francesco
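
As an illustration of steps 3 and 9, the following minimal Python sketch shows one possible way to detect different ways of writing the same value and to associate a controlled-vocabulary identifier with each database value. It is not the project's actual workflow. The sample rows, the `place` column, the `place:N` identifier scheme and all function names are hypothetical, and the normalization rule (ignore case, accents and extra whitespace) is only an assumption about what counts as the same value.

```python
import unicodedata
from collections import defaultdict


def normalize(value: str) -> str:
    """Lowercase, strip accents, and collapse whitespace (assumed matching rule)."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def group_variants(values):
    """Step 3: group the raw spellings that share one normalized form."""
    groups = defaultdict(set)
    for value in values:
        groups[normalize(value)].add(value)
    return dict(groups)


def build_vocabulary(groups, prefix):
    """Step 9: give every normalized form an identifier (hypothetical scheme)."""
    return {
        key: f"{prefix}:{number}"
        for number, key in enumerate(sorted(groups), start=1)
    }


def attach_ids(rows, column, vocabulary):
    """Return copies of the rows with an added '<column>_id' field."""
    return [
        {**row, f"{column}_id": vocabulary.get(normalize(row[column]))}
        for row in rows
    ]


# Hypothetical sample rows standing in for the work database.
rows = [
    {"record": 1, "place": "Genève"},
    {"record": 2, "place": "geneve "},
    {"record": 3, "place": "Lausanne"},
]

groups = group_variants(row["place"] for row in rows)
places = build_vocabulary(groups, "place")
linked = attach_ids(rows, "place", places)

for row in linked:
    print(row["record"], repr(row["place"]), row["place_id"])
```

In this sketch the two spellings of Geneva end up under the same identifier, while the original values stay untouched in the rows. In practice the grouped variants would probably be reviewed by hand before a vocabulary entry is fixed, since a purely mechanical rule can merge values that should remain distinct.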