What we captured
More than 70 websites are currently being harvested on a daily, weekly, monthly or quarterly basis.
Many adjustments still need to be made, and the media landscape can change from one day to the next.
This will be an ongoing thematic collection, and we will share more detailed information once the initial pilot project is complete.
About the collection
Our domain crawls form the basis of the web archive: a large number of websites are harvested all at once, creating a “snapshot” of the Luxembourg web at a given moment. However, these crawls take around one month to complete, and we are only able to run two domain crawls per year.
Naturally, there are many areas of the web where we miss changes between domain crawls. To complete the picture formed by the large-scale crawls, we are also implementing thematic collections, which concentrate on types of websites and topics that warrant more attention and more frequent captures.
The Internet is all about the latest buzz: topics and events that dominate the flow of new information and mark a specific moment in Internet history. These events are captured in event collections, which add to the domain crawls and thematic collections. Although the methods and collections differ, all captures of all websites are integrated into the same web archive.
News media play an important role in all event collections. However, the scope of an event collection is always time-bound, and its coverage of news media is limited to the duration of the project. This collection aims to offer ongoing coverage of Luxembourg news media, with an evolving seed list and an adaptive harvesting strategy.