Posts


Aug. 17, 2020

Do your sensors yourself

A big question I’ve asked myself during this project is: where is the best place to put my storage servers? There are multiple environmental variables to watch out for: temperature, humidity and noise. If components run too hot, they could be damaged in the long run. Of course, water and electricity are not friends. You can add a fan to move air out of the case and reduce both temperature and humidity, but the computer will become noisy. We need to measure those variables. Unfortunately, every system has a different set of built-in sensors, and not all of them are exposed to the operating system. So I decided to build my own sensors.

Aug. 14, 2020

Power consumption and failures prevention

Providing a full storage service means having computers up 24/7. On one hand, if we power off the local storage server when we aren’t using it, we’ll have to find a way to respect the backup policy and synchronize with remote servers that could be down at that moment. On the other hand, if we leave the storage server up all the time, it will consume unnecessary resources and throw money down the drain. I firmly believe that a personal computer, which is idle most of the time, doesn’t consume that much power. This is my conviction. But how can I verify it?

Aug. 10, 2020

Increased observability with the TIG stack

Observability has become a buzzword lately. I must admit, this is one of the many reasons why I use it in the title. In reality, this article will talk about fetching measurements and creating beautiful graphs to feel like detective Derrick, an old and wise detective who solves cases by encouraging criminals to confess on their own.

With the recent rise in popularity of the Go programming language, we have seen a lot of new software coming into the database world: CockroachDB, TiDB, Vitess, etc. Among them, the TIG stack (Telegraf, InfluxDB and Grafana) has become a reference for gathering and displaying metrics.

Aug. 7, 2020

Problem detection and alerting

Everything is distributed, automated and runs in perfect harmony with a common goal: protect your data. But bad things happen, and rarely when you expect them. This is why you need to watch the state of your services and send a notification when something goes wrong. Monitoring systems are well known in the enterprise world. For our use case, we don’t need to deploy a complex infrastructure to check a couple of hosts. For this reason, I chose to use the good old Nagios Core. It even provides a web interface for humans like us.

Aug. 3, 2020

Geographic distribution with Sanoid and Syncoid

Failures happen at multiple levels: a single disk can fail, as well as multiple disks, a single server, multiple servers, a geographic region, a country, the world, the universe. The probability of a failure decreases as the number of simultaneous events it requires grows. Costs and complexity increase with the number of failure events you want to handle. It’s up to you to find the right balance between all those variables.

For my own infrastructure at home, I was able to put storage servers in three different locations: two in Belgium (10 km apart) and one in France. They all share the same data. Up to two storage servers can burn or be flooded entirely without data loss. There are various redundancy solutions at the host level, but I will not cover them in this article.
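To make the "probability decreases with the number of simultaneous events" idea concrete, here is a minimal sketch. It assumes each location fails independently with the same probability over a given period, which is a simplification (a regional flood could hit both Belgian sites at once). The 1% figure and the function name are purely illustrative assumptions, not measurements from my setup.

```python
def simultaneous_failure_probability(p_single: float, sites: int) -> float:
    """Probability that all `sites` locations fail in the same period.

    Assumes failures are independent and equally likely at each site,
    so the individual probabilities simply multiply.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if sites < 1:
        raise ValueError("sites must be at least 1")
    return p_single ** sites


if __name__ == "__main__":
    # Hypothetical value: 1% chance of losing a given site in a year.
    p = 0.01
    for n in (1, 2, 3):
        print(f"{n} site(s) lost together: {simultaneous_failure_probability(p, n):.6%}")
```

Under these assumptions, each extra location multiplies the chance of total data loss by the single-site probability, while the cost grows roughly linearly with the number of servers. That is the balance to strike.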

Jul. 31, 2020

State of Internet bandwidth in Belgium

I was born and raised in a little city next to Paris in France. In the early 2000s, unlimited “high-speed” Internet access revolutionized communications. No need to monopolize the phone line with a 56 kbps modem anymore. Since then, bandwidth has kept increasing. We have seen ADSL, ADSL2 and fiber technologies. We had something called “Triple play” offers where unlimited phone calls, TV and Internet were packed together. There were three major companies on the market: France Telecom/Orange, Bouygues and Neuf/Cegetel/SFR (depending on the year). Then Free jumped into that alliance and broke prices with revolutionary offers. Since then, all French ISPs have offered “low prices” (between 30 and 50€/month) for “high-speed” access (hundreds of Mbps for both down and up) thanks to fiber deployment.

Jul. 27, 2020

Network configuration with OpenVPN

Networking is hard. Dealing with ISP modem settings is even harder. Mine doesn’t have a static public IP address by default. If the modem reboots, it will likely be assigned a new one. For regular people, this is not a problem when browsing the Internet. But for hackers like us, it means we cannot use the IP address itself to reach the private network from the outside world. It becomes a problem when we try to join hosts in different networks.

Jul. 24, 2020

Hardware adventures and operating systems installation

At the beginning of the project, the goal was to create a single storage server at my apartment. So I bought a fancy case with front bays to hot-swap disks and I retrieved an Intel NUC motherboard from work. It had only two SATA ports available, which is not enough for the minimum of four disks I needed: one for the system and three for the storage. I bought a PCI RAID card to add four ports. I connected two small SSDs for the system and four data disks, then installed FreeBSD without any issue. I started to copy data to the storage space when a noisy alarm began to wake everybody up in the building. This was unbearable. I decided to buy a micro ATX motherboard with processor and memory to replace the Intel NUC board. Wrong. I had confused the micro ATX and mini ITX formats. The first one was too big to fit in the box. So I bought a classic ATX case with a cheap power supply and 3x2TB disks from work. Storage1 was born.

Jul. 20, 2020

Infrastructure overview

The idea behind this infrastructure is to run on commodity servers. No need to buy big racks of expensive servers as we see in data centers. Simple homemade computers will do the job. At work, I have access to cheap hard drives that were used in servers and are either out of warranty or not suitable for enterprise workloads. They generally cost half their market price. I use a mix of brand new and re-used drives to reduce the risk of having two disks fail at the same time in the same host.

Jul. 17, 2020

Storage servers at home

I was born in the 90s. I grew up with computers. Other generations call us “digital natives”. I am lucky and proud to work with computers every day, with a specialization in databases. People tend to generate lots of data. It might be administrative papers (bills, contracts, paychecks), sentimental photo albums or anything else, as long as it is their data. At work, we take care to back up every piece of data as though it were the most important thing in the world. At home, it should be the same but, in fact, nobody really cares about it until the data is gone for good.