Discussion David Francesco 9.2.2024

Way 1

factoid import

Definition of features (a list exists)
Clarify/validate proposals & features with researcher
Prepare excel files based on defined features to be filled by Jacopo
Produce data for a span of 10 years, or for 1 year every 5 years? (who does this?)
Excel import and linking of places, persons and institutions in Geovistory (ongoing)
Create factoid mappings in toolbox
Automatically produce audiences in Toolbox
Produce in Toolbox the identified places and the most important identified persons & institutions
Configure kafka-streams to produce a graph from the factoid mappings
Develop a web component to display mash-up information from factoid mappings and factual information
Prepare webpage for Globalvat for public access (possibly with interactive maps)

Way 2

full import

finish excel transcriptions
clean errors in data sheets (add participations where needed)
1. define standards to do so
define key entities of key classes
decide which key entities to be identified
1. decide which persons, groups, places have to be identified
2. create them in Geovistory to obtain an identifier; while doing so, check whether they already exist in the Geovistory database
prepare data sheets for import
1. interlink all identified places, people and groups with GV identifier
prepare import tables clearly identifying for each column the corresponding class

Discussion 23.02.

present: Jacopo, David, Francesco

API to query the entities in GV, e.g. with OpenRefine

produce audiences with the following information

udienzeCol = [
    "Date audience",
    "Jour semaine",
    "Identifiant audience\ncalculé",
    "Heure",
    "Modalité de renseignement heure (Liste fixe)",
    "Durée maximale (minutes)",
    "Modalité de renseignement durée (Liste fixe)",
    "Type audience\nselon la source",
    "Type audience (catégorie recherche)",
    "Lieu",
    "Folio dans le fascicule",
]
partCol = [
    "Identifiant audience\ncalculé",
    "Personne reçue (comme indiquée)",
    "Qualité personne (comme indiquée)",
    "Groupe reçu\n(comme indiqué)",
    "Qualité groupe\n(comme indiquée)",
    "Mention d’accompagnement\n(comme indiquée)",
    "Accompagnement",
    "Détails",
]

Procedure:

prepare two tables in sql logic
1. 1 table on audience
2. 1 on participations
  1. if participant is non-identified: propositional object
    1. prepare a “participant description”
  2. if participant is identified: preparation of this table includes disambiguation on the level of groups and persons
  3. for this:
    1. option 1
      1. prepare an aggregated list of all persons from project data using group by function in OpenRefine
      2. check with GV API - does it already exist?
        if yes, add Identifier
        if no: add definition for these persons
    2. option 2
      1. prepare an aggregated list of all persons from project data using group by function in OpenRefine
      2. check in the GV Toolbox if the person exists, create it manually
      3. copy the ID into the “participation” table
send two tables to KL in sql logic
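
The aggregation and lookup step shared by options 1 and 2 could be sketched roughly as follows. This is only an illustration using the Python standard library: `lookup_gv_id` is a hypothetical stand-in for the GV API query (here a plain dictionary lookup, not a real Geovistory call), and the sample names and the identifier format are invented.

```python
from collections import Counter

PERSON_COL = "Personne reçue (comme indiquée)"


def aggregate_persons(rows):
    """Count how often each person string occurs (similar to a group-by facet)."""
    names = ((row.get(PERSON_COL) or "").strip() for row in rows)
    return Counter(name for name in names if name)


def lookup_gv_id(name, known_ids):
    """Hypothetical stand-in for a GV API query: returns an identifier or None."""
    return known_ids.get(name)


def match_persons(rows, known_ids):
    """Split aggregated persons into those with an ID and those still to define."""
    matched, to_define = {}, []
    for name, count in aggregate_persons(rows).most_common():
        gv_id = lookup_gv_id(name, known_ids)
        if gv_id is not None:
            matched[name] = gv_id              # "if yes, add Identifier"
        else:
            to_define.append((name, count))    # "if no: add definition"
    return matched, to_define


# Invented sample data, for illustration only.
sample_rows = [
    {PERSON_COL: "Card. Example A"},
    {PERSON_COL: "Card. Example A"},
    {PERSON_COL: "Mons. Example B"},
    {PERSON_COL: ""},
]
known = {"Card. Example A": "gv:0001"}  # made-up identifier format
matched, to_define = match_persons(sample_rows, known)
```

The `to_define` list keeps the occurrence count, which may help to decide which persons are worth identifying manually.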

reflections:

first important step is to define a “rule” to differentiate between participations that are identified (with persons or groups) and participations with non-identified “participant descriptions”
maybe have more tables
- audiences
- participant (with foreign key to “table person” and to “audience table”)
- identified persons
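
A minimal sketch of what such a three-table layout might look like in SQL logic, using an in-memory SQLite database. All table and column names are assumptions chosen for illustration, not an agreed schema, and the inserted rows are invented.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audience (
    audience_id   TEXT PRIMARY KEY,   -- e.g. the calculated audience identifier
    audience_date TEXT,
    place         TEXT
);
CREATE TABLE person (
    person_id     INTEGER PRIMARY KEY,
    gv_identifier TEXT,               -- identifier obtained from GV, if any
    label         TEXT NOT NULL
);
CREATE TABLE participant (
    participant_id INTEGER PRIMARY KEY,
    audience_id    TEXT NOT NULL REFERENCES audience(audience_id),
    person_id      INTEGER REFERENCES person(person_id),  -- NULL if not identified
    description    TEXT               -- wording as given in the source
);
"""


def create_db():
    """Create an in-memory database with foreign key checks switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = create_db()
# Invented sample rows, for illustration only.
conn.execute("INSERT INTO audience VALUES ('A-1', '1939-03-01', 'Example place')")
conn.execute("INSERT INTO person (gv_identifier, label) VALUES ('gv:0001', 'Example person')")
conn.execute(
    "INSERT INTO participant (audience_id, person_id, description) "
    "VALUES ('A-1', 1, 'as in source')"
)
conn.execute(
    "INSERT INTO participant (audience_id, person_id, description) "
    "VALUES ('A-1', NULL, '3 unnamed visitors')"
)
```

Leaving `person_id` nullable is one possible way to keep identified and non-identified participants in the same table; a separate table for descriptions would work as well.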

on the level of identified persons :

option 1 or 2 can be chosen once the aggregated list under 1.a and 2.a is ready
Option 2 might be suitable for the “cardinals”, but not for thousands of persons

next steps:

have a discussion with Laura on
- the “content” of audiences & participations (see above)
- the characteristics of a “participant description” that are relevant to keep
  - number of persons
  - nationality
produce code for 3 tables, including a table on persons (the current code produces 2 tables)
- udienze
- partecipazioni
- participant description
test code on “March 1939 table”
- produce the 3 tables
- test to identify persons, identify groups, identify participation descriptions
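
A rough sketch of how one sheet could be split into the three tables. The rule used here (a participant counts as identified only if its string appears in a lookup of known entities) is an assumption for illustration, as are the output field names, the sample rows and the identifier format.

```python
AUDIENCE_ID = "Identifiant audience\ncalculé"
PERSON = "Personne reçue (comme indiquée)"
GROUP = "Groupe reçu\n(comme indiqué)"


def split_rows(rows, identified):
    """Split sheet rows into audiences, participations and participant descriptions.

    `identified` maps a person or group string to an identifier. Rows whose
    string is not in the mapping end up as participant descriptions.
    """
    audiences, participations, descriptions = {}, [], []
    for row in rows:
        aud_id = row[AUDIENCE_ID]
        # One audience record per identifier, however many rows mention it.
        audiences.setdefault(
            aud_id, {"audience_id": aud_id, "date": row.get("Date audience")}
        )
        label = (row.get(PERSON) or row.get(GROUP) or "").strip()
        if not label:
            continue
        if label in identified:
            participations.append(
                {"audience_id": aud_id, "entity_id": identified[label]}
            )
        else:
            descriptions.append({"audience_id": aud_id, "text": label})
    return list(audiences.values()), participations, descriptions


# Invented sample rows, for illustration only.
rows = [
    {AUDIENCE_ID: "A-1", "Date audience": "1939-03-01", PERSON: "Card. Example A"},
    {AUDIENCE_ID: "A-1", "Date audience": "1939-03-01", GROUP: "Group of pilgrims"},
    {AUDIENCE_ID: "A-2", "Date audience": "1939-03-02", PERSON: "Mons. Example B"},
]
identified = {"Card. Example A": "gv:0001"}  # made-up identifier format
audiences, participations, descriptions = split_rows(rows, identified)
```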
have a meeting to discuss together next steps on the 6th of March at 13h30

6.3.2024 Meeting

Vincent, Francesco

Discussion in Rome with Laura

attributes of audience selected & discussed

udienzeCol = [
    "Folio dans le fascicule",
    "Date audience",
    "Jour semaine",
    "Identifiant audience\ncalculé",
    "Heure",
    "Modalité de renseignement heure (Liste fixe)",
    "Durée maximale (minutes)",
    "Modalité de renseignement durée (Liste fixe)",
    "Type audience\nselon la source",
    "Type audience (catégorie recherche)",
    "Lieu",
    "Détails",
    "Recommandation",
]
partCol = [
    "Identifiant audience\ncalculé",
    "Personne reçue (comme indiquée)",
    "Qualité personne (comme indiquée)",
    "Groupe reçu\n(comme indiqué)",
    "Qualité groupe\n(comme indiquée)",
    "Mention d’accompagnement\n(comme indiquée)",
    "Accompagnement",
]

two kinds of participants

identified participants (persons or groups)
unidentified participants (sets of persons)

aim:

rich data on famous participants
information on anything else

threshold

identified vs unidentified

Challenge with Mentions:

how to identify them
table for all mentions

What is the goal of the webpage?

website that can be searched to some degree
- audiences
- who participates
  - some identified high-level persons/groups
    - cardinals, bishops
  - participant descriptions, with associated attributes:
    - type (religious, other),
    - number,
    - nationality,
    - gender,
  - → allow for analysis of some particular aspects

mentions/participant description table (a set or a person)
- string with everything as per source (Cardinal y, accompanied by 3 Jesuit brothers)
- link to jesuits
- number “3”
- male/female
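
As an illustration, one row of such a mention table could be derived from the source string roughly as follows. The keyword lists and the field names are hypothetical and far from complete; real mentions would probably need manual checking.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists, for illustration only.
GROUP_KEYWORDS = {"jesuit": "Jesuits"}
GENDER_KEYWORDS = {"brothers": "male", "sisters": "female"}


@dataclass
class Mention:
    text: str                         # string exactly as per source
    number: Optional[int] = None      # number of persons, if stated
    group_link: Optional[str] = None  # linked group, if recognised
    gender: Optional[str] = None


def first_match(text, keywords):
    """Return the label of the first keyword found in the text, else None."""
    lowered = text.lower()
    return next((label for key, label in keywords.items() if key in lowered), None)


def parse_mention(text):
    """Extract the first number, a group link and a gender hint from a mention."""
    found = re.search(r"\b(\d+)\b", text)
    return Mention(
        text=text,
        number=int(found.group(1)) if found else None,
        group_link=first_match(text, GROUP_KEYWORDS),
        gender=first_match(text, GENDER_KEYWORDS),
    )


mention = parse_mention("Cardinal y, accompanied by 3 Jesuit brothers")
```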

table mention

also discussed:

Special audiences
accompaniment

Notebook Jacopo allows to create three tables:

(using google API)

audiences
participants
mentions

possibilities:

mentions with similarities
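
One possible way to group similar mentions is simple string similarity with `difflib` from the Python standard library. The threshold of 0.8 and the sample strings are assumptions; this is a sketch of the idea, not a tested method for the project data.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_similar(mentions, threshold=0.8):
    """Greedily put each mention into the first group whose first member is similar."""
    groups = []
    for mention in mentions:
        for group in groups:
            if similarity(mention, group[0]) >= threshold:
                group.append(mention)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append([mention])
    return groups


# Invented sample mentions, for illustration only.
sample_mentions = ["3 Jesuit brothers", "3 jesuit brother", "group of pilgrims"]
groups = group_similar(sample_mentions)
```

Because the grouping is greedy, the result depends on the order of the input; sorting the mentions by frequency first might give more stable groups.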

in case it is difficult:

fewer participants and richer descriptions

Working Procedures

Data Pipeline Globalvat