Medialab Sprint March 18-20 + datasets on adaptation

This is to announce our three-day sprint on adaptation. The aim of this sprint is less to produce final maps (visualizations ready to be published) than to develop a clear, as-simple-as-possible procedure to extract data from the datasets listed below. The expected outputs are processed data tables that can be:
- easily opened and exploited with tools such as Gephi, LibreOffice, ManyEyes and the like;
- easily updated as new data pours into the datasets without too much technical intervention.

After we have obtained these outputs, we will meet and discuss them with the users.

We can make these datasets/corpora available to all of you if needed. They can also be discussed at the next EMAPS meeting.


DATASETS
[Unities = rows/documents/unit level; Entities = columns/features/sub-elements/annotations/fields/data-objects]


1. Scientific literature in Scopus on adaptation
  1. Source: http://www.scopus.com/
  2. Query: TS=(“climate change” OR “global warming”) AND TS=(adapt* OR vulnerab* OR resilien*), from 1992
  3. Unities: bibliographical references
  4. Entities: Authors*, Title*, Year*, Source title*, Volume, Issue, Art. No., Page start, Page end, Page count, Cited by*, Link, Affiliations*, Authors with affiliations*, Abstract, Author Keywords*, Index Keywords*, Molecular Sequence Numbers, Chemicals/CAS, Tradenames, Manufacturers, Funding Details*, References*, Correspondence Address*, Editors, Sponsors*, Publisher, Conference name*, Conference date, Conference location, Conference code*, ISSN*, ISBN*, CODEN*, DOI*, PubMed ID, Language of Original Document, Abbreviated Source Title*, Document Type*, Source, Link*, URL(?)
  5. Number of unities: about 15,000
  6. File formats: CSV / BibTeX
  7. Notes: we can use the link in the CSV file between unities and the online article*
TO DO during the sprint: prepare a script that compiles and parses a CSV exported from Scopus and generates various GEXF files (for relational information) and CSVs (for temporal series).
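
A minimal sketch of such a script, assuming the standard Scopus CSV export with a semicolon-separated "Authors" column and a "Year" column (the file name and column labels are assumptions to check against a real export):

    import csv
    from collections import Counter
    from itertools import combinations

    import networkx as nx

    def scopus_to_outputs(csv_path):
        graph = nx.Graph()    # co-authorship network (relational -> GEXF)
        per_year = Counter()  # papers per year (temporal series -> CSV)
        with open(csv_path, newline="", encoding="utf-8-sig") as f:
            for row in csv.DictReader(f):
                per_year[row.get("Year", "")] += 1
                authors = [a.strip() for a in row.get("Authors", "").split(";") if a.strip()]
                for a, b in combinations(authors, 2):
                    weight = graph.get_edge_data(a, b, default={"weight": 0})["weight"]
                    graph.add_edge(a, b, weight=weight + 1)
        return graph, per_year

    graph, per_year = scopus_to_outputs("scopus_export.csv")
    nx.write_gexf(graph, "coauthorship.gexf")
    with open("papers_per_year.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["year", "papers"])
        writer.writerows(sorted(per_year.items()))

Other relational entities (keywords, affiliations, cited references) can be handled the same way by switching the column that is split.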

2. Scientific literature in ISI on adaptation

  1. Source: http://webofknowledge.com/
  2. Query: TS=(“climate change” OR “global warming”) AND TS=(adapt* OR vulnerab* OR resilien*), from 1992
  3. Unities: bibliographical references
  4. Entities: FN: File Name, VR: Version Number, PT: Publication Type, AU: Authors*, AF: Author Full Name, CA: Group Authors, TI: Document Title*, ED: Editors, SO: Publication Name*, SE: Book Series Title, BS: Book Series Subtitle, LA: Language, DT: Document Type*, CT: Conference Title, CY: Conference Date, HO: Conference Host, CL: Conference Location, SP: Conference Sponsors, DE: Author Keywords*, ID: Keywords Plus®*, AB: Abstract*, C1: Author Address, RP: Reprint Address*, EM: E-mail Address, FU: Funding Agency and Grant Number*, FX: Funding Text, CR: Cited References*, NR: Cited Reference Count*, TC: Times Cited*, PU: Publisher, PI: Publisher City, PA: Publisher Address, SC: Subject Category*, SN: ISSN*, BN: ISBN*, J9: 29-Character Source Abbreviation, JI: ISO Source Abbreviation*, PD: Publication Date, PY: Year Published*, VL: Volume, IS: Issue, PN: Part Number, SU: Supplement, SI: Special Issue, BP: Beginning Page, EP: Ending Page, AR: Article Number, PG: Page Count, DI: Digital Object Identifier (DOI)*, GA: Document Delivery Number, UT: Unique Article Identifier, ER: End of Record
  5. Number of unities: about 10,000
  6. File format: TXT
TO DO during the sprint: prepare a script that compiles and parses a TXT file exported from ISI WoK and generates various GEXF files (for relational information) and CSVs (for temporal series).
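
The ISI/WoK plain-text export is a tagged format: each field starts with a two-letter tag, continuation lines are indented with three spaces, and "ER" closes a record. A minimal parser sketch (the file name is a placeholder):

    # Parse the WoS/ISI tagged plain-text export into one dict per
    # reference, mapping each field tag (AU, PY, CR, ...) to its values.
    def parse_isi(path):
        records, record, tag = [], {}, None
        with open(path, encoding="utf-8-sig") as f:
            for line in f:
                line = line.rstrip("\n")
                if line.startswith("   ") and tag:   # continuation of previous field
                    record[tag].append(line.strip())
                elif line[:2] == "ER":               # end of record
                    records.append(record)
                    record, tag = {}, None
                elif line[:2].strip():               # new field tag
                    tag = line[:2]
                    record.setdefault(tag, []).append(line[3:].strip())
        return records

    records = parse_isi("isi_export.txt")
    print(len(records), "references parsed")

From the parsed records, the CR field yields the citation network for the GEXF outputs, and PY yields the temporal series, along the same lines as the Scopus script.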

3. IPCC reports

  1. Source: http://www.ipcc.ch/publications_and_data/publications_and_data_reports.shtml#.USNcz1p4a94
  2. Query: IPCC Report 1, 2, 3, 4, 5 (text not available yet)
  3. Unities: Reports of the 3 working groups, Synthesis reports, Special reports (not yet done)
  4. Entities: Title of report, Date, Contributor name, Contributor role, Contributor institution, Contributor nationality, Institution country, Contributor chapter, Chapter’s title. Analytical entities: “chapter content” addressed by reports (e.g. sea level rise), Contributor’s “chapter content”
  5. Number of unities: about 30
  6. File format: PDF
TO DO during the sprint: Find intelligent ways to visualize the tables of contributors to the 5 IPCC reports.
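
One possible angle, sketched below: once the contributor tables have been extracted from the PDFs into a CSV, turn them into a contributor-chapter bipartite network that Gephi can lay out. The file and its columns mirror the entities listed above, but are assumptions:

    import csv
    import networkx as nx

    g = nx.Graph()
    with open("ipcc_contributors.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            contributor = row["Contributor name"]
            chapter = row["Title of report"] + " / " + row["Contributor chapter"]
            g.add_node(contributor, kind="contributor",
                       country=row.get("Contributor nationality", ""))
            g.add_node(chapter, kind="chapter")
            g.add_edge(contributor, chapter)
    nx.write_gexf(g, "ipcc_contributors.gexf")

Colouring contributors by nationality or institution in Gephi then makes the composition of each report visible at a glance.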

4. Earth Negotiations Bulletins on UNFCCC

  1. Source: http://www.iisd.ca/vol12/
  2. Query: all the documents on the page
  3. Unities: Daily reports of negotiation, Summaries of negotiation meeting
  4. Entities: Volume*, Report collection number*, Meeting type*, Meeting number*, Date*, Meeting place*, Report type*, Text (within the documents, titles are in bold; look for “adaptation” and “in the corridors” in titles)*
  5. Number of unities: about 600
  6. File format: HTML / HTML and PDF
TO DO during the sprint: Analyse the PDFs with Pattern and extract: a monopartite network of country names appearing in the same sentence; a bipartite network of country names and issue noun phrases appearing in the same sentence; a monopartite network of issue noun phrases appearing in the same sentence; and a network of who is speaking for whom (“NAME OF COUNTRY for NAME OF THE GROUP”).
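
A hedged sketch of the first of these networks (country names co-occurring in a sentence), using Pattern for sentence splitting; the country list and input file are placeholders, and the text is assumed to be already extracted from the PDFs:

    from itertools import combinations

    import networkx as nx
    from pattern.en import parsetree

    COUNTRIES = {"Tuvalu", "China", "Brazil", "Norway"}  # placeholder: load the full list

    def country_cooccurrence(text):
        g = nx.Graph()
        for sentence in parsetree(text):
            found = {w.string for w in sentence.words if w.string in COUNTRIES}
            for a, b in combinations(sorted(found), 2):
                weight = g.get_edge_data(a, b, default={"weight": 0})["weight"]
                g.add_edge(a, b, weight=weight + 1)
        return g

    text = open("enb_vol12_extracted.txt", encoding="utf-8").read()
    nx.write_gexf(country_cooccurrence(text), "enb_countries.gexf")

The bipartite country-issue network follows the same logic, taking each sentence's NP chunks (sentence.chunks) as candidate issue noun phrases.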

5. UNFCCC documents

  1. Source: http://unfccc.int/documentation/documents/items/3595.php
  2. Query: Document type: “Meeting documents – Submissions by parties and organizations”
  3. Unities: position papers of countries before UNFCCC meetings
  4. Entities: url, pdf_filename, symbol*, title, authors*, pdf_url, pdf_language, abstract*, meeting*, doctype* (meeting papers, draft conclusions, party submissions, in-depth review, meeting reports, workshop documents, IGO submissions, submissions by parties and organizations, compilation and synthesis reports, technical papers, reports by the Secretariat, progress reports, NGO submissions, submissions, resolutions and decisions, national adaptation programmes of action (NAPA), treaties, synthesis and assessment reports, scenario notes), topics*, keywords*, countries*, pub date*, year*, text (pdf)*
  5. Number of unities: about 800
  6. File format: HTML and PDF
TO DO during the sprint: Index the documents and query them manually to generate various GEXF files (for relational information) and CSVs (for temporal series).
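
One way to build the index, sketched with the Whoosh library; the schema follows the entities above, and the sample document and query are placeholders:

    import os
    from whoosh.fields import ID, TEXT, Schema
    from whoosh.index import create_in
    from whoosh.qparser import QueryParser

    os.makedirs("unfccc_index", exist_ok=True)
    ix = create_in("unfccc_index", Schema(symbol=ID(stored=True),
                                          title=TEXT(stored=True),
                                          text=TEXT))

    writer = ix.writer()
    writer.add_document(symbol="FCCC/SBI/2011/MISC.1",  # placeholder record
                        title="Submissions by parties",
                        text="... full text extracted from the PDF ...")
    writer.commit()

    with ix.searcher() as searcher:
        for hit in searcher.search(QueryParser("text", ix.schema).parse("adaptation")):
            print(hit["symbol"], hit["title"])

Hits can then be grouped by country, meeting or year to feed the GEXF and CSV outputs.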


6. UNFCCC side events

  1. Source: http://regserver.unfccc.int/seors/reports/archive.html
  2. Query: all documents
  3. Unities: table of side events at all sessions from 2003 onwards
  4. Entities: session name, event title, event organizer, event date
  5. NB: possible to retrieve attachments (PDF) for each side event
  6. Number of unities: 26 tables
  7. File format: HTML
TO DO during the sprint: Scrape the table of side events and transform it into a CSV.
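
A minimal scraping sketch, assuming the archive pages expose the side events as plain HTML table rows (the exact page structure and column order need checking against the live pages):

    import csv
    import requests
    from bs4 import BeautifulSoup

    url = "http://regserver.unfccc.int/seors/reports/archive.html"
    soup = BeautifulSoup(requests.get(url).text, "html.parser")

    with open("side_events.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["session", "title", "organizer", "date"])
        for tr in soup.find_all("tr"):
            cells = [td.get_text(strip=True) for td in tr.find_all("td")]
            if cells:
                writer.writerow(cells)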

7. weADAPT.org, through their website API

  1. Source: http://weadapt.org/, http://api.weadapt.org/docs/
  2. Query: API Access
  3. Unities: placemark (= project), organization, initiative (to be confirmed with weADAPT)
  4. Entities: author name, organization name/acronym, placemark latitude, placemark longitude, placemark title, tags
  5. Number of unities: about 1000
  6. File format: XML
TO DO during the sprint: Generate, for each pair among organisations, projects and tags, the bipartite network and its two monopartite projections:
- organisations and projects; organisations sharing projects; projects sharing organisations;
- tags and projects; tags sharing projects; projects sharing tags;
- tags and organisations; tags sharing organisations; organisations sharing tags.
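
A sketch of one triple (organisations and projects), assuming the XML has already been reduced to (organisation, project) pairs; the sample pairs are placeholders:

    import networkx as nx
    from networkx.algorithms import bipartite

    pairs = [("SEI", "placemark-1"), ("SEI", "placemark-2"), ("IIED", "placemark-2")]

    b = nx.Graph()
    b.add_nodes_from({o for o, _ in pairs}, kind="organisation")
    b.add_nodes_from({p for _, p in pairs}, kind="project")
    b.add_edges_from(pairs)

    orgs = {n for n, d in b.nodes(data=True) if d["kind"] == "organisation"}
    org_net = bipartite.weighted_projected_graph(b, orgs)            # orgs sharing projects
    proj_net = bipartite.weighted_projected_graph(b, set(b) - orgs)  # projects sharing orgs

    for name, g in [("bipartite", b), ("organisations", org_net), ("projects", proj_net)]:
        nx.write_gexf(g, "weadapt_" + name + ".gexf")

The tag-project and tag-organisation triples are produced by swapping the pair extraction.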

8. Websites on adaptation through Hyphe

  1. Source: Hyphe
  2. Query: Seed websites decided by experts
  3. Unities: Websites and hyperlinks

TO DO during the sprint: Crawl the seeds, sieve the neighbours, and iterate 2 or 3 times.

2 Responses to “Medialab Sprint March 18-20 + datasets on adaptation”

  1. It sounds great. How/where will the data be published (e.g. an open data portal)? Will the identified procedures be coded and released?

  2. Well, yes, I hope the data will be made available to all, along with all the necessary procedures/code. How to do that is a question we need to address at the upcoming meeting, especially during the technical meeting on the last day (April 18th). I think you or someone at Density should attend that meeting (more details to come soon from the Amsterdam team).
