Sia S3 Integration: IPFS

Creating a Public IPFS Gateway Backed By Sia

Skunk_Ink
The Sia Blog


IPFS is a distributed file system that allows users to store and share files in a decentralized manner. It is an excellent tool for sharing data but requires running a node on your computer to access the network and transfer data between peers. In order to make IPFS more accessible on the web, companies and individuals set up public IPFS gateways. These gateways allow users to access and download IPFS files through a centralized web server using standard HTTP requests. This article walks through the process of first setting up an IPFS node using Sia as the backend data store through renterd and then exposing the node as a public gateway.

Sia is a decentralized storage network that lets users store and retrieve data by forming contracts with storage providers worldwide. The renterd software handles this securely: data is erasure-coded and distributed across geographically diverse nodes, so it can be reconstructed from a subset of the pieces even if some hosts go down or become unavailable. By default, renterd uses 10-of-30 erasure coding, meaning the data can be reconstructed from any 10 of the 30 shards.
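The redundancy trade-off of 10-of-30 coding works out as follows. This is just back-of-the-envelope arithmetic using the numbers above, not renterd code:

```python
# Illustrative math for renterd's default 10-of-30 erasure coding.
data_shards = 10    # minimum shards needed to reconstruct a file
total_shards = 30   # shards uploaded, each to a different host

redundancy = total_shards / data_shards          # storage overhead
tolerated_failures = total_shards - data_shards  # hosts that can vanish

print(redundancy)           # 3.0 -> each byte is stored 3x on the network
print(tolerated_failures)   # up to 20 of the 30 hosts can be offline
```

So the price of surviving 20 simultaneous host failures is storing every byte three times.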

I have little experience with IPFS outside Brave’s built-in IPFS node. This article is both a basic guide and a learning experience for me. There is very little chance it is optimized and a very high chance it is incorrect in some places. There are much better options, but I took what appeared to be the most common path for new IPFS users and used Kubo as the base for the gateway.

tl;dr:

Armed with my limited knowledge of IPFS, the plan is to use Kubo and connect it to the renterd S3 gateway as the data store. As an extra bonus assignment, I’ll make the IPFS node available publicly with a pretty domain name by putting Caddy in front as a reverse proxy.

Things We’ll Use

Things We’ll Need For IPFS

  • renterd — Sia’s new renting software. Enables users to store and retrieve data on the Sia network by forming contracts with storage providers and repairing data. It’s a bit complex at the moment, but it’s getting better every day.
  • Kubo — The original IPFS gateway. Kubo is written in Go and is a great starting point for building your own gateway. There are better and more involved methods, but we’re going to stick to the easiest one for the sake of this article.
  • go-ds-s3 — A plugin for Kubo that allows storing IPFS data in an S3-compatible data store. This is the key to making this all work. Kubo is a great starting point, but it doesn’t support Sia out of the box. Luckily, renterd comes with an S3-compatible API that we can use.

Additional Things for a public gateway

  • Caddy — Caddy is a great web server that makes setting up a reverse proxy for the gateway easy.
  • A domain name — We’ll use a custom domain name to make the gateway easier to access and to enable SSL.
  • A server — We’ll need a server to run renterd and the gateway on. It needs at least 128 gigabytes of storage to store the Sia blockchain, renterd database, and IPFS config files.

Setting up renterd

The first step is to set up renterd and ensure it works properly. We will gloss over the exact steps of configuring a new renterd node and focus on the important parts. If you want to learn more about renterd, check out the official docs. There’s a lot of waiting: for the blockchain to sync, then for the initial wallet transaction to be confirmed, and then for contracts to form. It’s a bit of a process, but it’s not too bad.

The S3 gateway is enabled by default, but you still need to set up an access key and secret key for authentication. Keys are added in the renterd.yml config file; if you don’t have one, create it in the same directory as the renterd binary. Eventually, you’ll be able to manage S3 keys directly from the UI, but for now, you need to add them manually.

s3:
  enabled: true
  keypairsV4:
    my-s3-access-key: my-s3-secret-key

With that added, we’re ready to start renterd. Since this node is brand new, we’ll be waiting a while for it to sync, find hosts, and form contracts.

While waiting, setting up the allowance, autopilot configuration, and host blocklists is a good idea.

On the “Autopilot” page, I have found that a minimum of 10 terabytes for expected storage and 15 KS for the allowance works well for renterd. The suggested values are fine for the rest of the configuration. renterd’s Autopilot automatically selects storage providers, forms and renews contracts, and repairs data in the background, keeping your files healthy.

Moving on to the “Configuration” page, make sure “Upload Packing” is turned on. This allows renterd to transparently pack files, getting around the minimum file size restrictions.

Finally, on the “Wallet” page, grab the wallet address and send some Siacoins to it. After that comes a long wait for contracts to form and syncing to finish.

Setting up Kubo

Now that renterd is running, the next step is to set up Kubo. Kubo is a great starting point for building a public IPFS gateway. It’s written in Go and is easy to modify. There are more optimized options, but I’m going to use the best-known and best-maintained one for this article.

I thought Kubo would be the easy part, but it turned out to be only slightly less annoying than renterd. There was no easy way to add the S3 datastore plugin: I had to build Kubo from source and add the plugin manually. The Kubo documentation is pretty good, but it would have been nice if it already included the S3 datastore plugin. Luckily, the go-ds-s3 repo provides a handy script to bundle the plugin with Kubo.

# We use go modules for everything.
export GO111MODULE=on

# Clone kubo.
git clone https://github.com/ipfs/kubo
cd kubo

# Pull in the datastore plugin (you can specify a version other than latest if you'd like).
go get github.com/ipfs/go-ds-s3/plugin@latest

# Add the plugin to the preload list.
echo -en "\ns3ds github.com/ipfs/go-ds-s3/plugin 0" >> plugin/loader/preload_list

# ( this first pass will fail ) Try to build kubo with the plugin
make build

# Update the deptree
go mod tidy

# Now rebuild kubo with the plugin
make build

# (Optionally) install kubo
make install

Next, we need to initialize IPFS with ipfs init --profile server. This creates the necessary config files and directories, with defaults suited to running on a server.

Now, we need to tell Kubo to use the S3 data store, and I’m going to change a few other values based on suggestions I gathered from random people on the internet. By default, IPFS stores files locally on your computer; we want to upload them to Sia instead. The default config looks like this:

{
  ...
  "Datastore": {
    "StorageMax": "10GB",
    "StorageGCWatermark": 90,
    "GCPeriod": "1h",
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "HashOnRead": false,
    "BloomFilterSize": 0
  }
  ...
}

The first value I’m going to change is StorageMax. This is the maximum amount of storage that IPFS will use. I’m going to set it to 1 terabyte. The next value is BloomFilterSize, which is the size in bytes of the block store’s bloom filter. I’ve read that 1MB (1048576) is a good value, so I will set it to that.
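These two edits can be scripted instead of made by hand. Here’s a minimal sketch using Python’s json module; the values are the ones chosen above, and the default ~/.ipfs repo location is an assumption (adjust if you set IPFS_PATH):

```python
import json
from pathlib import Path

def tune_datastore(config: dict) -> dict:
    """Apply the two Datastore edits described above to a parsed Kubo config."""
    config["Datastore"]["StorageMax"] = "1TB"         # raise the default 10GB cap
    config["Datastore"]["BloomFilterSize"] = 1048576  # 1 MiB bloom filter, in bytes
    return config

# Usage against a real repo (default location; adjust if IPFS_PATH is set):
#   path = Path.home() / ".ipfs" / "config"
#   path.write_text(json.dumps(tune_datastore(json.loads(path.read_text())), indent=2))
```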

Lastly, I need to edit the Spec section. This is where we tell IPFS to use the S3 datastore plugin instead of the default local datastore.

{
  ...
  "mounts": [
    {
      "child": {
        "type": "s3ds",
        "region": "us-east-1",
        "bucket": "ipfs",
        "rootDirectory": "",
        "regionEndpoint": "localhost:8080",
        "accessKey": "my-s3-access-key",
        "secretKey": "my-s3-secret-key"
      },
      "mountpoint": "/blocks",
      "prefix": "flatfs.datastore",
      "type": "measure"
    },
    {
      "child": {
        "compression": "none",
        "path": "datastore",
        "type": "levelds"
      },
      "mountpoint": "/",
      "prefix": "leveldb.datastore",
      "type": "measure"
    }
  ],
  ...
}

Now that we’ve edited the config, we have one more file to edit before we can run Kubo. Even though the daemon has never been run, ipfs init already wrote a datastore_spec file, so it needs to be updated to match our new data stores.

This is fairly simple; only a single line of JSON needs to be changed.

{"mounts":[{"bucket":"ipfs","mountpoint":"/blocks","region":"us-east-1","rootDirectory":""},{"mountpoint":"/","path":"datastore","type":"levelds"}],"type":"mount"}
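It’s easy to fat-finger this line, so a quick sanity check that it parses and agrees with the Spec section can save a confusing startup error. This is my own check, not part of Kubo:

```python
import json

# The single line written to datastore_spec, as shown above.
datastore_spec = (
    '{"mounts":[{"bucket":"ipfs","mountpoint":"/blocks","region":"us-east-1",'
    '"rootDirectory":""},{"mountpoint":"/","path":"datastore","type":"levelds"}],'
    '"type":"mount"}'
)

spec = json.loads(datastore_spec)
mountpoints = {m["mountpoint"] for m in spec["mounts"]}

assert spec["type"] == "mount"
assert mountpoints == {"/blocks", "/"}           # both mounts present
assert spec["mounts"][0]["bucket"] == "ipfs"     # must match the renterd bucket
print("datastore_spec looks consistent")
```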

Alright, now we’re ready to run Kubo with ipfs daemon. If everything is working correctly, you should see a bunch of output and, eventually, a line that says Daemon is ready.

Initializing daemon...
Kubo version: 0.24.0-dev-a4efea5c7-dirty
Repo version: 15
System version: amd64/linux
Golang version: go1.21.0
2023/09/30 18:58:48 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details.
Swarm listening on /ip6/::1/tcp/4001
Swarm listening on /ip6/::1/udp/4001/quic-v1
Swarm listening on /p2p-circuit
Swarm announcing /ip4/.../tcp/4001
RPC API server listening on /ip4/127.0.0.1/tcp/5001
WebUI: http://127.0.0.1:5001/webui
Gateway server listening on /ip4/127.0.0.1/tcp/8080
Daemon is ready

First, let’s try to access the local web UI and upload a file. I have just the image for testing our gateway.

The web UI is working well. Let’s pin our first file.

Great! We now have a working IPFS node with data backed up to Sia’s decentralized storage. Now that we have a file pinned and uploaded to the Sia network, we can check the renterd UI at http://localhost:9980 to see the pinned blocks.

Bonus Points: Setting up a Public Gateway

Next, we’re going to set up a gateway so others can download our file directly from our node. Operating your own IPFS public gateway offers several advantages, including greater control over data and privacy, less reliance on centralized services, and greatly reduced costs. By managing your own infrastructure, you can tailor the gateway to your specific needs, manage its performance, and ensure a seamless experience for your users while contributing to the growth and resilience of the decentralized web.

Adding Caddy in Front

If you’re unfamiliar with Caddy, it’s a simple HTTP server. Like everything else in this article, it’s also written in Go. I find it much easier to work with than Nginx or Apache. I’ve installed the Caddy package on my Debian server. Now, I need to add my domain to the Caddyfile to forward IPFS requests to my running IPFS daemon.

For simplicity’s sake, I won’t set up the subdomain redirect today.

(cors) {
    @cors_preflight method OPTIONS
    @cors header Origin *

    handle @cors_preflight {
        header Access-Control-Allow-Origin *
        header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE"
        header Access-Control-Allow-Headers "Content-Type, Authorization"
        header Access-Control-Max-Age "3600"
        respond "" 204
    }

    handle @cors {
        header Access-Control-Allow-Origin *
        header Access-Control-Expose-Headers "Link"
    }
}

example.com {
    import cors

    @ipfs path /ipfs/* /ipns/*

    route @ipfs {
        reverse_proxy 127.0.0.1:8080
    }
}

All that’s left now is to restart Caddy to apply the configuration, generate an SSL certificate, and hope it works.
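Once Caddy reloads, you can check the gateway from anywhere. A small sketch of the request shape — example.com and the CID are placeholders for your own domain and a file you pinned earlier:

```python
import urllib.request

def gateway_url(domain: str, cid: str) -> str:
    """Build a path-style URL matching the /ipfs/* route Caddy proxies above."""
    return f"https://{domain}/ipfs/{cid}"

# With a real domain and a pinned CID, a fetch like this should return 200
# once DNS and the SSL certificate are in place:
#   with urllib.request.urlopen(gateway_url("example.com", "<your-cid>"), timeout=30) as resp:
#       print(resp.status, resp.headers.get("Content-Type"))
```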

Success!

That’s it. We’re done. We now have our renterd S3 gateway connected to a running IPFS daemon and have set up a public IPFS gateway.

Closing Thoughts

Setting up all the bits and pieces takes some work, but the results are extremely good. 1 terabyte, including 3x redundancy, on Sia costs about $5/mo. Bandwidth can be an additional overhead, but adding caching in front of the gateway will help reduce the overall cost and dramatically improve performance. The combination of renterd and Kubo is a highly cost-effective way to pin files for public use.
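To put that $5/mo in perspective, here is the rough arithmetic; the price is the estimate above and fluctuates with the network:

```python
# Rough cost math: $5/mo stores 1 TB of files, and 10-of-30 erasure
# coding means 3 TB of raw shards actually live on hosts.
price_per_stored_tb = 5.00   # USD/month, redundancy included (article's figure)
redundancy = 3               # 30 shards / 10 data shards

raw_tb_on_network = 1 * redundancy
price_per_raw_tb = price_per_stored_tb / raw_tb_on_network

print(raw_tb_on_network)              # 3 TB of raw shards on hosts
print(round(price_per_raw_tb, 2))     # ~1.67 USD per raw TB per month
```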

renterd uploads are decently fast, but downloads can be slow while fetching all of the IPFS chunks. There is lots of room for optimization in the renterd download code, so I expect this to improve significantly over time. Setting up a cache, like Varnish, would help alleviate any issues for frequently used files.

Looking forward

We’ve been working on a much more optimized IPFS integration called FSD, which we’re excited to share with the Sia and IPFS communities soon. FSD allows seamless storage of IPFS blocks on the Sia network, leveraging Sia’s high-performance decentralized storage. With IPFS as the content-addressing standard, Sia will be well positioned to act as the storage layer for the many projects and applications built on web3.

If you are looking for a storage solution that is truly decentralized, secure, private, and resilient, consider Sia. You can learn more about Sia from our website and documentation. You can also join our community of data storage enthusiasts and developers.
