2020 Covid-19

Aim

This collection aims to capture as much information as possible about the coronavirus pandemic in Luxembourg. We started it on 18th March 2020 and collected data intensively. Since autumn 2020, we have continued the collection at a reduced pace.
The scope around the subject is broad: we are not only capturing all aspects of the pandemic itself, but also the resulting changes and discussions in Luxembourg.

Seed list

Seeds serve as a starting point for web crawls. One seed can lead to a number of different pages. The more seeds are used, the more extensive the results of a web harvest will be.

Open seed list

Coverage

While we have good coverage of websites (some with one-time captures, others with daily and weekly captures) and news media outlets (individual articles and daily crawls), social media platforms are not captured as extensively. In fact, daily crawls are in place only for Twitter pages, while data from Facebook pages and YouTube channels is limited. This is due to high data budget costs.

Foreign websites

We capture international news articles related to travel restrictions to and from Luxembourg, as well as official sources on this subject from Luxembourgish and cross-border authorities. The harvested news articles are not exclusively related to the situation in Luxembourg but broadly address covid-19 and the consequences of the coronavirus pandemic all over the world.

What we captured

About the collection

We started the covid-19 collection on 18th March 2020. On that date, there were 81 known cases in Luxembourg. One of the first steps was to go through all news outlet pages, gathering every relevant article about the coronavirus. At this stage, it was already clear that the coronavirus would become the most extensive subject we had ever captured, in terms of its impact both on society and on the internet.

Participation from the public

For other collections, we have already launched calls for participation and invited political parties to share their online presence in order to improve the lists we use for web harvests. For the covid-19 collection, we issued a call to a larger audience, since the crisis has touched every aspect of society and community, making it impossible for us to research and detect every relevant website for the collection on our own. The response was overwhelmingly positive: we received many contributions from smaller communities and minorities whose experiences didn’t get any coverage in the news. Without the call for participation, we would thus have been unable to capture this vital aspect of life during the coronavirus pandemic.

Priorities and methods

Our coverage of covid-19 focuses on websites, news outlets and Twitter. Unfortunately, we have limited coverage of Facebook, YouTube and other social media platforms. This is because we have to prioritise our technical resources over an undetermined period of time, as the end date for the collection has not yet been defined. We have combined several methods since March 2020:
– Manual crawls with the Archive-It tool
– Domain crawls in collaboration with the Internet Archive
– Continuous online news media crawls in collaboration with the Internet Archive
– Manual crawls with Webrecorder/Conifer

Scope and limitations

These methods vary in the kind of sites that are captured, the number of URLs included in each web crawl and the frequency of captures. For instance, some websites are captured only once, whereas media outlets have been captured on a daily or weekly basis since June. As manual crawls with Archive-It are limited by a data budget, we have to select the seeds for each category carefully. This method allows for a higher frequency and focuses on high-priority sites. Nevertheless, we still aim to capture the larger picture, for example by collecting data from all websites with a .lu domain. Such large-scale domain crawls are done twice a year. For the covid-19 collection, we were able to add an additional domain crawl at the beginning of April, raising the frequency of these larger-scale captures of the Luxembourgish web to once every four months between December 2019 and December 2020.

Collection end

Due to the unpredictable nature of the pandemic and the unprecedented situation in Luxembourg and the world, it is difficult to determine a fixed end to the crawls in this collection. We plan to keep collecting while the subject still dominates the news, the internet and people’s lives, but we have to manage our resources, which may affect the pace of harvesting.

Articles and presentations

RTL Interview

04. 04. 2020

RTL

WARCnet Presentation

04. 05. 2020

YouTube

IIPC Blogpost

23. 06. 2020

IIPC

WARCnet Paper

24. 09. 2020

WARCnet

C2DH Roundtable at Uni.lu

15. 09. 2020

C2DH

De Stëllefëller (The Silencefiller)

We asked Serge Tonnar about his experiences of the coronavirus pandemic over the past few months, how his work has continued online, and his stance on social media.

Interview in Luxembourgish

Interview in English
