
Science Gateways and Dataset Dissemination

The cloud can be a big help in making a dataset available to others in your field. The primary challenge is dealing with ongoing storage fees and the extra egress charges that cloud platforms levy when others download your data. There are a few strategies for dealing with this.

The rest of this article will go into these strategies in detail.

Storage-adjacent Computation

Many cloud-hosted gateways disseminate data by providing their users with an experimentation platform, usually a JupyterHub. Rather than each user downloading what they need to their own storage, users run their code or use tools hosted on cloud machines that have free access to the central dataset.
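To make the trade-off concrete, here is a minimal back-of-the-envelope sketch comparing the two models. The function names and every rate used with them are hypothetical placeholders, not real provider prices, and the gateway model assumes that users pull no significant data out of the cloud. Substitute your own provider's figures before drawing conclusions.

```python
# Rough monthly cost comparison: direct downloads vs. storage-adjacent compute.
# All names and rates here are illustrative assumptions, not real prices.


def direct_download_cost(dataset_gb, full_downloads, storage_rate, egress_rate):
    """Monthly cost when users each download a full copy of the dataset.

    storage_rate and egress_rate are in currency units per GB.
    """
    storage = dataset_gb * storage_rate
    egress = dataset_gb * full_downloads * egress_rate
    return storage + egress


def gateway_cost(dataset_gb, storage_rate, compute_hours, compute_rate):
    """Monthly cost when users run code next to the data instead.

    Assumes no meaningful egress: only storage plus hosted compute time.
    """
    storage = dataset_gb * storage_rate
    compute = compute_hours * compute_rate
    return storage + compute


if __name__ == "__main__":
    # Hypothetical scenario: 1 TB dataset, 50 full downloads per month,
    # versus 2000 hours of shared notebook compute.
    print("direct:", direct_download_cost(1000, 50, 0.02, 0.09))
    print("gateway:", gateway_cost(1000, 0.02, 2000, 0.10))
```

The point of the sketch is the shape of the two formulas: download costs scale with dataset size times the number of downloads, while gateway costs scale with compute time and are independent of how many users read the data.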

Case studies