Add blog post about self-hosting

This commit is contained in:
Elara 2023-02-12 13:56:50 -08:00
parent a9291fa1c9
commit c78d234293
2 changed files with 37 additions and 136 deletions


@@ -1,136 +0,0 @@
+++
title = "Using Go's runtime/cgo to pass values between C and Go"
date = "2022-09-22"
summary = "Calling C code from Go (or Go code from C) presents certain challenges, especially when Go values need to be used by C. This article discusses the `runtime/cgo` package added in Go 1.17 and how it can be used to solve these challenges."
+++
## The problem
Often, I come across a complex problem that I can't solve myself. This is the purpose of libraries, so just import one and you're done, right? Well, that works until you come across a problem for which a library hasn't been written in your language (Go, in this case). What if you could use a library from another language, such as C, in your Go program?
This is what CGo is for. It is a Foreign Function Interface, meaning it allows you to call functions from another language. CGo calls functions from C. The issue is that, since C is a completely separate language from Go, it has its own rules for how it expects things to be done.
This limitation also means that C doesn't know how to handle a value from Go. Such values must be converted to values that can be understood by C. For example, if you have a `string` in Go, you can't simply pass that to C. It must be converted to a `*C.char` before doing so, since C doesn't know about Go's `string` type.
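For example, here's a minimal, self-contained sketch of converting a Go `string` to a `*C.char` (and back) using the `C.CString()` and `C.GoString()` helpers that cgo provides. Note that the C copy is allocated with `malloc`, so it has to be freed manually:
```go
package main

// #include <stdlib.h>
import "C"

import (
    "fmt"
    "unsafe"
)

func main() {
    s := "hello"

    // C.CString copies the Go string into C-allocated memory,
    // producing a *C.char that C functions can accept.
    cs := C.CString(s)

    // The copy is allocated with malloc, so we're responsible for
    // freeing it once the C side is done with it.
    defer C.free(unsafe.Pointer(cs))

    // C.GoString converts a *C.char back into a Go string.
    fmt.Println(C.GoString(cs))
}
```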
Here, we come across some challenges. Let's say you want your C library to do some processing and then call a method on a struct with the results of the operation. How do you pass a struct to C so that it can call a method? And how do you even call the method when C doesn't have methods? Well, `runtime/cgo` is the perfect solution for this.
There are two problems here. First of all, as I said, C doesn't understand Go's values and doesn't even have methods. Second, Go is a garbage-collected language, meaning it automatically cleans up unused data (garbage) by deleting (collecting) it. Unfortunately, C has no way to know when this happens, and Go has no way to know whether C still needs a value, so anything passed to C should not be collected until C is done using it.
## How does `runtime/cgo` help?
[`runtime/cgo`](https://pkg.go.dev/runtime/cgo) is a package added in Go 1.17, and its purpose is to solve exactly the problems I discussed above. It has a [`NewHandle()`](https://pkg.go.dev/runtime/cgo#NewHandle) function that takes any value and returns a [`Handle`](https://pkg.go.dev/runtime/cgo#Handle), which has the underlying type `uintptr`. This means it can be converted to a `C.uintptr_t`, which C can understand.
Internally, `NewHandle()` creates a unique integer for the value you've passed in and holds a reference to it. This integer is what the function returns. Holding a reference to the value means the garbage collector will leave it alone, because as far as it can tell, the value is still in use. So, problem solved, right? Well, kind of. We now have a `Handle`, which is an integer, but how do we call a method on an integer?
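To make that lifecycle concrete before bringing C into the picture, here's a minimal sketch using nothing but `runtime/cgo` itself:
```go
package main

import (
    "fmt"
    "runtime/cgo"
)

func main() {
    // Pin a value and get back an integer-sized handle for it.
    h := cgo.NewHandle("some value")

    // Retrieve the original value through the handle. Value()
    // returns interface{}, so a type assertion is needed.
    s := h.Value().(string)
    fmt.Println(s)

    // Release the handle so the garbage collector is free to
    // reclaim the value once nothing else references it.
    h.Delete()
}
```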
## How to use it
So, first, let's say we have a Go program like this:
```go
package main

import (
    "fmt"
    "runtime/cgo"
)

// #include <stdint.h>
import "C"

// Result holds the values that the C library will produce.
type Result struct {
    value1 int16
    value2 int64
    value3 uint32
}

// Set stores the three values in the Result.
func (r *Result) Set(val1 int16, val2 int64, val3 uint32) {
    r.value1 = val1
    r.value2 = val2
    r.value3 = val3
}

// String formats the stored values for printing.
func (r *Result) String() string {
    return fmt.Sprintf("1: %d, 2: %d, 3: %d", r.value1, r.value2, r.value3)
}
```
Now, let's say we want our C library to provide these results, set them, and then print the string returned by `*Result.String()`. How do we do this with `runtime/cgo`?
First of all, we need a way for our C library to create a new `Result` value. We'd do this in Go using a `NewResult()` function, and we'll do the same here, but using handles instead of returning the value directly:
```go
//export CNewResult
func CNewResult() C.uintptr_t {
    result := &Result{}
    handle := cgo.NewHandle(result)
    return C.uintptr_t(handle)
}
```
This function creates a new `*Result` using `&Result{}`. Then, it creates a new handle for this value using `cgo.NewHandle()`. This handle is a `uintptr` as I mentioned above, so it can be converted to C's `C.uintptr_t` and returned.
Now we have a number corresponding to our `Result` value, but how do we call a method on it? Since C doesn't have methods, we'll need to write Go functions that call them for us. C also doesn't have direct access to the value, so those functions will have to get it back out of the handle. Since Go is still holding onto the value, we can convert the `C.uintptr_t` back into a `Handle` and call its `Value()` method. That returns an `interface{}`, so we use a type assertion to get the `*Result` back.
```go
//export CResultSet
func CResultSet(handle C.uintptr_t, val1 C.int16_t, val2 C.int64_t, val3 C.uint32_t) {
    // Get the *Result back.
    result := cgo.Handle(handle).Value().(*Result)
    // Call the method we wanted to use from C,
    // converting the C values back to Go values.
    result.Set(int16(val1), int64(val2), uint32(val3))
}

//export CResultString
func CResultString(handle C.uintptr_t) *C.char {
    // Get the *Result back.
    result := cgo.Handle(handle).Value().(*Result)
    // Call the method we wanted to use from C.
    str := result.String()
    // Since string is a Go type, we need to convert it to a *C.char
    // using C.CString(), which Go provides whenever we import "C".
    // Note: C.CString() allocates the copy with malloc, so the C
    // caller is responsible for freeing it.
    cStr := C.CString(str)
    return cStr
}
```
As you can see, all you need to do is
```go
result := cgo.Handle(handle).Value().(*Result)
```
and you get the original Go value back, ready to do whatever you need with it.
Now, there's one more issue. As I mentioned before, Go is a garbage-collected language. Creating a handle stopped the value from being garbage collected so that C could use it without worrying that it might disappear underneath it. The problem is that since our values are no longer collected, if we keep creating new ones, they'll just keep filling up the computer's RAM for no reason. To solve this, `Handle` has a method called `Delete()`, which removes the reference that `runtime/cgo` was holding onto, allowing the garbage collector to collect the value again. We need to expose this to C so that it can notify us when it's done with the value.
```go
//export CFreeResult
func CFreeResult(handle C.uintptr_t) {
    // Drop the reference held by runtime/cgo so the garbage
    // collector can reclaim the *Result.
    cgo.Handle(handle).Delete()
}
```
That's it. Using what we've created from the C side is pretty easy. Simply call the exported functions:
```c
// Go creates this file for us. It contains all the exported functions.
#include "_cgo_export.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void foo() {
    uintptr_t result = CNewResult();
    CResultSet(result, -1, 123, 456);

    char* str = CResultString(result);
    printf("%s\n", str);
    // CResultString() returns memory allocated by C.CString(),
    // so it's our responsibility to free it.
    free(str);

    CFreeResult(result);
}
```
Calling the `foo()` function should print:
```text
1: -1, 2: 123, 3: 456
```
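For completeness, here's a minimal sketch of one way this could be wired together, assuming the C code above is saved as `foo.c` in the same package directory (so cgo compiles it and it can include `_cgo_export.h`), and a hypothetical `main.go` declares `foo()` in its cgo preamble and calls it; the file names are just for illustration:
```go
package main

/*
// foo() is defined in foo.c, which cgo compiles and links
// as part of this package.
extern void foo(void);
*/
import "C"

func main() {
    // Call into C, which in turn calls back into the exported
    // Go functions defined earlier.
    C.foo()
}
```
With cgo enabled, a plain `go build` compiles the Go and C sources together. Alternatively, the Go side can be built as a C archive or shared library with `-buildmode=c-archive` or `-buildmode=c-shared` and linked into an existing C program.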


@@ -0,0 +1,37 @@
+++
title = "My personal self-hosting journey"
date = "2023-02-12"
slug = "self-hosting"
+++
In this blog post, I'll delve into my personal experience of discovering and embracing self-hosting, and explore the advantages of this approach and why it might be worth considering for you.
For many years, I relied on free services such as Google Drive, Google Docs, Dropbox, GitHub, etc., and was very happy with them. I often wondered how companies could offer these services for free. It wasn't until later that I discovered the answer. Companies like Google, Microsoft, and Apple collect their users' data and sell it to advertisers, who use it to target manipulative ads more accurately at the people most likely to be influenced by them.
Moreover, users must trust both the service providers and the data purchasers to secure their information, prevent unauthorized access, and refrain from reselling it to entities that may pose a risk to the users' privacy. Historically, there have been numerous instances where both the service providers and data purchasers have proven to be unreliable and untrustworthy in their pursuit of higher profits.
This is where self-hosting comes into play. Instead of relying on corporations to store your data, you host it yourself, which gives you complete control over your information and the ability to secure it to your own standards. Additionally, it gives you access to features that these corporations often restrict behind paywalls.
After learning about data collection, I was motivated to find a way to stop it. Initially, I tried other service providers that claimed to prioritize privacy, but I soon realized that they couldn't guarantee their claims and often turned out to be just as problematic as the providers they replaced. That's when I began exploring self-hosting and picking up the necessary background knowledge: Linux servers, how the internet works, and various internet protocols. With time, I felt confident in my understanding and decided to take the leap into self-hosting.
At 14 years old, I saved up some birthday money and acquired 5 Raspberry Pi 4s. I set them up in my room and started trying things. At first, it was rough. I failed at all my attempts, but I was determined to continue, and eventually, I got my first service reachable from the internet. With that experience, I continued setting up more services. However, I eventually faced a problem: the Raspberry Pi models I purchased only had 1 GB of RAM, and I used 16 GB microSD cards for storage, which was insufficient for hosting some of the services I wanted. It also prevented me from using orchestration tools such as Kubernetes, as even the stripped-down variants required more RAM to function properly.
After a while, I decided to expand my cluster, so I bought 5 more Raspberry Pi 4s, this time with 2 GB of RAM, as well as a 24-port network switch and an extremely long Ethernet cable, using money I got from New Year gifts. With the help of my dad, I ran the Ethernet cable from our router to my room. On the router, I installed OpenWrt, a custom firmware for routers, and used its VLAN features to isolate my servers from the rest of the network and protect them from potential attacks involving WiFi and IoT devices. My dad also helped me construct a glass display case, which I used to house all my servers.
The extra power from the new Raspberry Pis allowed me to host even more services, but eventually, I ran into issues where certain services were designed to run on x86 CPUs such as those made by Intel and AMD, while Raspberry Pis use ARM CPUs like the ones often used in mobile devices. Around this time, I built myself a new desktop computer, which meant I no longer needed to use my 2012 Mac Mini with a broken drive. I replaced the old HDD with an SSD, upgraded the RAM to 16 GB, and added it to my cluster.
The addition of an x86 system with lots of RAM and lots of fast storage allowed me to host services that previously would've been difficult or impossible to host. The Mac Mini ran services such as [Nextcloud](https://github.com/nextcloud/server), [OnlyOffice](https://github.com/ONLYOFFICE/server), and [Gitea](https://github.com/go-gitea/gitea) to replace Google Drive, Google Docs, and GitHub respectively. These services required lots of RAM and storage, so they could not be run on my Raspberry Pis. It also ran services such as [Invidious](https://invidious.io/), which is written in the Crystal programming language and couldn't be compiled for ARM at the time.
However, as my cluster continued to grow, I found it increasingly difficult to manage and keep track of all the devices and the various services they were hosting. The typical solution to this problem is to use an orchestration tool like Kubernetes; however, that wasn't an option for me, since some of my nodes only had 1 GB of RAM, and even the most lightweight Kubernetes distributions consume a substantial amount of it.
Eventually, while looking for a completely different tool, my friend found and showed me a tool called [Nomad](https://nomadproject.io). I looked at its documentation and discovered that it was much more lightweight than Kubernetes because it omitted features that were unnecessary for many clusters. I successfully set up a Nomad server on my Mac Mini and effortlessly added my other servers as clients. With minimal setup, everything was up and running. While exploring Nomad, I discovered two additional powerful tools, Consul and Traefik. Consul keeps track of the services running and their metadata, while Traefik acts as a reverse proxy, dynamically communicating with Consul to discover services. I incorporated these tools into my Nomad cluster, enhancing its functionality.
I quickly began converting the services I was running to Nomad services, fascinated by the extremely high level of automation. I was able to simply write a configuration file describing the service I wanted to run, and Nomad would automatically figure out which server would be optimal for that service, download it on that server, configure it, run it, and then manage it while it's running. Then, Nomad sent the metadata to Consul, from where Traefik was able to pick it up and automatically reconfigure itself to proxy requests to that service and acquire a TLS certificate.
I was impressed by Nomad's resilience, as I discovered when I intentionally shut down one of the servers running various services. Within seconds, Nomad stepped in and rescheduled the services on a different server, downloading and configuring them, and updating the Consul data. This prompted Traefik to reconfigure itself, and in no time, everything was back up and running seamlessly with no intervention from me.
Over time, I added more nodes, and every time, Nomad was able to add them to the cluster seamlessly and began scheduling tasks on them. I now run lots of services, including [SearXNG](https://docs.searxng.org/), [Matrix Dendrite](https://matrix.org/docs/projects/server/dendrite), [Homer](https://github.com/bastienwirtz/homer/), [Woodpecker CI](https://woodpecker-ci.org/), [MinIO](https://min.io/), [CyberChef](https://github.com/gchq/CyberChef), [Gitea](https://gitea.io), and much more. In fact, this site is hosted on that cluster. It's stored in a git repo on my Gitea server. When I update it, Woodpecker rebuilds it, and then uploads it to MinIO, from where Nomad downloads and deploys it.
I release all the files I use for my services publicly at https://gitea.arsenm.dev/Arsen6331/nomad (mirrored to https://github.com/Arsen6331/nomad).
Beyond the benefits of increased privacy, this experience has been a valuable learning opportunity for me. I have gained knowledge that I otherwise would not have had the chance to acquire. I highly recommend that anyone with an interest in self-hosting give it a try. You don't have to go for a complex setup like mine to experience the benefits of self-hosting. Many people self-host their services on an old laptop that they no longer use.