Open Infrastructures and the Future of Knowledge Production, part 2

In my last post, I unpacked some of the reasons why open infrastructures matter for the future of knowledge production, and I talked a bit about how Humanities Commons strives to live out the principles of community governance that truly open infrastructure requires. But I ended on a less cheerleadery note: We aren’t a perfect alternative to the corporate platforms by which we’re surrounded. And this is where we need to dig down into the dirty underside of digital infrastructure. As Deb Chachra points out, the term “infrastructure” literally points to those systems that are hidden, in our walls, under our floors, and buried underground. If we are going to mitigate the inequities created by and sustained through our infrastructures, we have to get busy unearthing those systems and finding ways to build new ones.

And so: We need to take a hard look at the fact that the infrastructure that Humanities Commons is built upon is AWS, or Amazon Web Services. As you might guess from the name, AWS is part of the Greater Jeff Bezos Empire, and every dollar that we spend to host with them helps to keep that empire running. And run it does! Amazon’s revenue derived from AWS passed $80 billion-with-a-b in 2022, and as of August 2023, AWS hosted 42 percent of the top 100,000 websites, and 25 percent of the top one million (ironically enough including BuiltWith, the site from which these data are made available).

Why has Amazon become such a powerful force in web hosting and cloud computing? Largely because they provide not just servers but a powerful and wide-ranging suite of tools that help folks like us not just make our platform available but also help keep it stable and secure and enable it to scale with enormous flexibility. AWS provides connected equipment and tools that would be more than a full-time job for someone to maintain in-house, and it enables redundancy and global reach at speed, and it’s relatively easy to manage.

So… it works for us, just as it works for 42,000 of the top 100,000 websites across the internet. But I’m not happy about it. It’s not just that I hate feeding more money into the Bezos empire every month, but that I know for certain that our values and Bezos’s do not align. And every so often I have to stop and ask myself how much good it does for us to build pathways of escape from the extractive clutches of Elsevier and Springer-Nature, only to have those pathways deliver us all into the gaping maw of Amazon?

AWS has a stranglehold on web-based platforms of our size, as we’re too complicated for a server kept under the desk, too big for a smaller hosting service, and too small for our own data center. And if you don’t want to deal with the risks and costs involved in owning and operating the metal yourself, there just aren’t many alternatives, and certainly not many good ones.

Our host institution, Michigan State University, like most institutions its size, operates both a large-scale data center through our central IT unit and a high-performance computing center under the aegis of the office of research and innovation. The latter can’t really help us, as it’s focused pretty exclusively on computational uses and not at all on service hosting. And the former comes with a suite of restrictions and regulations in terms of access and security – pretty understandably so, given recent attacks and exploits such as the one that caused our neighbor to the east to disconnect the entire campus from the internet on the first day of classes – but nevertheless restrictions that make it impossible for us to be flexible enough with our work.

In fact, central IT strongly encourages projects like ours to make use of cloud computing, given the complexity of our needs and the risk-averseness of the campus. And we have our pick! AWS, Microsoft’s Azure, and Google Cloud.

I just can’t help but think that it’s a Bad Thing for academic and nonprofit services like ours – services that are working to be open, and public, and values-aligned with our communities – to be dependent upon Silicon Valley megacorps for our very presence. We need alternatives. Real alternatives. And I fear that we’re going to have to invent them, because as the example of open access publishing demonstrates, waiting to see what commercial providers come up with is certain to increase our lock-in, and increase the level of resources they extract from our campuses.

So what might it look like if our infrastructure for the future of knowledge production and dissemination were community-led all the way down? What might enable the Commons to leave AWS behind and instead contribute our resources to supporting a truly shared, openly governed, not-for-profit cloud service? Could such a service be collaborative, with all member research institutions and organizations paying into a shared, professionally staffed data center?

King’s College London and Jisc think so – they established the first collaborative research data center in the world nine years ago, precisely in order to help UK institutions achieve economies of scale, to increase energy efficiency, and to reduce costs. Of course, it’s a lot easier to get all the UK institutions of higher education on board with such a centralized initiative, partly because there are fewer of them and partly because they are all centrally funded.

But what if Internet2, for instance, instead of restricting its areas of interest to networking and protocols, and instead of offering to connect member institutions with corporate cloud services, instead provided a real alternative – one that was not just developed for the academic community but that would be governed by that community? What if each member institution or organization agreed to contribute its existing infrastructure, along with its annual maintenance budget, to a shared, distributed, community-owned cloud computing center? Could excess capacity then be offered at reasonable prices to other nonprofit institutions or organizations or projects like mine, in a way that might entice them away from the Silicon Valley megacorps? Would our institutions, our libraries, our publishers, and our many other web-based projects find themselves with better control over their futures?

None of what I’m suggesting here would be easy, and a lot of the questions I’ve just asked fall – at least for the moment – into the realm of the pipe dream. But if we were willing to press forward with them, we might find ourselves in a world in which the scholarly communication infrastructures on which we build, develop, design, and publish our work can help us foster rather than hinder social and epistemic justice, can empower communities of practice by centering their needs and their work to meet them, and can enable trustworthy community governance and decision-making in support of truly open, public, shared infrastructures for the future of knowledge production.