Build CSV with persons and institutions listed in http://cths.fr - 2019, php
Thierry 4ab0c4333e Compute associations savants - societes 3 years ago

README


--- pending : parse savant page in run/4-csv.php ---

Code to build CSV files containing persons and institutions
listed by http://cths.fr (Comité des travaux historiques et scientifiques).

See the format of the resulting CSV files below.

Released under the GNU General Public License (GPL), version 3 or later.
Written with PHP 7.2, executed under Linux (Ubuntu 18.04)

Starting URL :
http://cths.fr/an/liste.php?sc=&pays=0&domaine=0&periode=0&page=1
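
As an illustration of how the list pages could be addressed, here is a minimal sketch that rebuilds the starting URL for any page number. It assumes that only the "page" parameter changes from one list page to the next; listPageUrl() and LIST_BASE_URL are hypothetical names, not taken from the scripts in run/.

```php
<?php
// Sketch only: listPageUrl() and LIST_BASE_URL are illustrative names.
// Assumption: list pages differ only by the "page" query parameter.

const LIST_BASE_URL = 'http://cths.fr/an/liste.php';

function listPageUrl(int $page): string
{
    if ($page < 1) {
        throw new InvalidArgumentException('Page numbers start at 1');
    }
    // Same query parameters as the starting URL; "&" is passed explicitly
    // so the result does not depend on the arg_separator.output setting.
    $query = http_build_query([
        'sc'      => '',
        'pays'    => 0,
        'domaine' => 0,
        'periode' => 0,
        'page'    => $page,
    ], '', '&');

    return LIST_BASE_URL . '?' . $query;
}

echo listPageUrl(1), PHP_EOL;
```

Calling listPageUrl(1) gives back the starting URL shown above.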

Installation
------------------------
Install the PECL yaml extension (Ubuntu package php-yaml) :
sudo apt install php-yaml

Preparation
------------------------
- Copy config.yml.dist to config.yml
- Adapt the paths to your local machine

Execution
------------------------
In a console, from the current directory (the one containing this README) :

Download to the local machine the pages listing the "sociétés savantes" (institutions)
php run/1-liste-societes.php

Download to the local machine the page of each société savante
php run/2-societes.php

Download to the local machine the pages of the "savants" (members of institutions)
php run/3-savants.php

Execution result
------------------------
Execution done on 2019-10-21 and 22

Step 1
Retrieved 81 list pages of sociétés savantes on local machine

Step 2
Retrieved 4005 pages of sociétés savantes

Step 3
Retrieved 26438 pages of savants

Downloadable result
------------------------
Two CSV files
The first line contains the field names, the other lines contain data
A semicolon (;) is used as field separator

societes.csv
------------
- ID : unique CTHS identifier of a société
- NAME : name of the société

savants.csv
------------
- ID : unique CTHS identifier of a savant
- FNAME : family name of the savant
- GNAME : given name of the savant
- BDATE : birth date
- BPLACE : birth place
- DDATE : death date
- DPLACE : death place
- SOCIETES : IDs of the sociétés of which the savant is or was a member
  List of numerical IDs separated by a plus sign (+)
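
The format above can be read back with a few lines of PHP. The sketch below is only an illustration: parseSavantLine() is a hypothetical helper, the sample row is invented, and it assumes that the columns appear in the order listed above and that an empty SOCIETES field means no société.

```php
<?php
// Sketch only: parseSavantLine() and the sample row are made up for illustration.
// Assumption: columns appear in the order documented in this README.

const SAVANT_FIELDS = ['ID', 'FNAME', 'GNAME', 'BDATE', 'BPLACE', 'DDATE', 'DPLACE', 'SOCIETES'];

function parseSavantLine(string $line): array
{
    // Semicolon separator; enclosure and escape characters are passed explicitly.
    $values = str_getcsv($line, ';', '"', '\\');
    if (count($values) !== count(SAVANT_FIELDS)) {
        throw new UnexpectedValueException('Unexpected number of fields');
    }
    $row = array_combine(SAVANT_FIELDS, $values);

    // SOCIETES holds numerical ids joined by "+".
    // Assumption: an empty field means the savant has no société.
    $row['SOCIETES'] = $row['SOCIETES'] === ''
        ? []
        : array_map('intval', explode('+', $row['SOCIETES']));

    return $row;
}

$row = parseSavantLine('123;Dupont;Jean;1850;Paris;1920;Lyon;45+678');
print_r($row['SOCIETES']);
```

For a real file, the same function could be applied to each line after skipping the first one, which holds the field names.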