We have deployed our new Media Cloud Online News Archive, which is now the default platform in Media Cloud Search. Currently, our index returns news content published from early November to the present. While we are excited to have a more stable and sustainable platform online, we expect some bugs as we adapt to the new system. Our small team is working through the data consistency and user interface issues that we (and others) spot, and we appreciate your patience.
We have scheduled system optimizations from 2/2/24 to 2/4/24. During this period, our web tools will be temporarily unavailable while we roll out changes to improve the functionality and performance of our platform.
Looking ahead, our focus in the upcoming year will be on re-indexing the rest of our archive of more than 1.5 billion news stories. We anticipate completing the indexing for the rest of 2023 by April 1st of this year. Earlier years should then be processed at a faster pace, which we estimate at 1-2 months of work per year of backlog.
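For anyone planning research around the re-indexing schedule, the 1-2 months per backlog year estimate can be turned into a rough completion window. The sketch below is only an illustration: the function names are our own, and the number of backlog years passed in is a hypothetical input, not a figure from our roadmap.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that falls `months` after start's month."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def backlog_window(start: date, backlog_years: int) -> tuple[date, date]:
    """Estimate the earliest and latest completion months.

    Assumes 1 month per backlog year in the best case and
    2 months per backlog year in the worst case.
    """
    earliest = add_months(start, backlog_years * 1)
    latest = add_months(start, backlog_years * 2)
    return earliest, latest


# Hypothetical input: 5 backlog years remaining after the April 1, 2024 milestone.
earliest, latest = backlog_window(date(2024, 4, 1), 5)
print(f"Estimated completion between {earliest:%B %Y} and {latest:%B %Y}")
```

Change the start date and the backlog year count to match the period you care about. The window is only as reliable as the 1-2 month estimate itself.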
We are also maintaining the connection to the Internet Archive Wayback Machine’s instance of our news data as a searchable platform. The Internet Archive is one of our collaborators, and it serves as a redundant searchable archive of the online news URLs our system discovers for indexing. Note that their index is managed independently by their team; we neither maintain nor audit it.
For more information on interacting with our Media Cloud Web Search API, including setup instructions and example API calls, please visit this link. We enjoy working with you all and supporting your research, and welcome any efforts to collaborate on our open-source software or grant applications if you have resources available.