out with the old, in with the new – house gets ssd upgrade

A week ago I wrote about another mechanical hard drive that was about to bite the dust in our house’s elaborate set-up.

Not having time for a full day of focus, I postponed the upgrade to this Saturday, with the agreement of the family, as they suffer through the maintenance period as well.

The upgrade needed careful preparation to be doable in one sitting. It was also meant to be a disaster-recovery drill of sorts: along the way I would restore the house’s central Docker and service infrastructure from scratch.

And this would need to happen:

  • all services, zfs pools, docker containers, configurations needed to be double checked for full backup – as this would be used to restore all (ZFS snapshots are just the bomb for these things!)
  • the main central docker server would have to go down
  • get all hard disks ripped out
  • SSDs put in and properly configured
  • get a fresh Ubuntu 18.04 LTS set up and booting from ZFS on an NVMe SSD (BIOS update(s)!, disabling secure boot, enabling AHCI, switching to M.2 instead of SATA Express… you get the idea)
  • get the network set-up in order: upgrading from Ubuntu 16.04 to 18.04 means ifupdown networking was replaced by netplan. Hurray! Not.
  • get docker-ce and docker-compose ready and set up, and all the funky networking aligned; discover along the way that there are currently major issues with IPv6 in Docker.
  • pull in the small number of still needed mechanical hard disks and import the ZFS pools
  • start the docker builds from the backup (one script \o/)
  • start the docker containers in their required order (one script \o/)
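
The "required order" in the last step is essentially a dependency ordering. As a minimal sketch of the idea (not the actual script used here), the start order can be derived from a dependency map with a topological sort. The service names and their dependencies below are hypothetical and only serve as an illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical services: each name maps to the set of services
# that must already be running before it can start.
DEPENDS_ON = {
    "mqtt-broker": set(),
    "database": set(),
    "node-red": {"mqtt-broker"},
    "grafana": {"database"},
    "sensor-ingest": {"mqtt-broker", "database"},
}


def start_order(depends_on):
    """Return the service names so that each one comes after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


for name in start_order(DEPENDS_ON):
    # A real script would start the container at this point;
    # this sketch only prints the order.
    print("would start:", name)
```

Keeping the dependencies in one map means a newly added service only needs one extra entry, and a circular dependency is reported as an error instead of causing a silent hang.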

Apart from some hardware/BIOS-related issues and the rather unexpected netplan introduction, everything went fairly well. It just takes ages to see data copied.

The “heartbeat” is a general term in our house for how busy everything is. It’s an artificial value calculated from sensor inputs per second, actions taken and so on, and a good indication of whether there are issues. During the maintenance window (orange/red) it wasn’t updated and stayed stuck at the pre-set value.

Bandwidth was the only real issue with this disaster recovery. All building blocks seemed to fall into place and no unplanned measures had to be taken. The house systems went partially down at around 12:30 and were back up roughly 10 hours later, at around 22:00. Of course non-automated things like the internet connection kept working, and all switches fell back to being plain manual push-buttons. So everything could still be done, just with a lot less convenience.
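
To give an idea of what such an artificial value could look like, here is a minimal sketch. It is not the formula used in the house: the function name, the weighting of actions and the time window are all assumptions made up for illustration.

```python
def heartbeat(sensor_inputs, actions, window_seconds, action_weight=5.0):
    """Combine sensor inputs per second and weighted actions per second.

    All parameters are hypothetical: sensor_inputs and actions are plain
    counts observed during window_seconds, and action_weight expresses the
    assumption that an action taken says more about activity than a single
    sensor reading.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    inputs_per_second = sensor_inputs / window_seconds
    actions_per_second = actions / window_seconds
    return inputs_per_second + action_weight * actions_per_second


# 600 sensor inputs and 12 actions within one minute
print(heartbeat(600, 12, 60))
```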

All in all there are more than 40 vital docker container based services that get started one after the other and interconnect to deliver a full house home automation. With the added SSD performance this whole ship is much much more responsive to activities. And hopefully less prone to mechanical defects.

The backup and disaster preparations proved practical and worked well. No beat was missed (except sensor readings during the 10 hours of downtime) and no data was lost.

A Core i3 at 3.7 GHz with 32 GB of RAM is sufficient and tuned for power consumption

What could be done better: it would be much more straightforward if there were fewer dependencies on external repositories / Docker Hub. Almost all issues that came up with containers were due to the fact that the maintainers had, just a day before, introduced something that kept them from spinning up normally. Bad luck. But that can be helped! There’s now a multi-page disaster-recovery-procedure document that will be used and updated in the future.

Oh and what speeds am I seeing? The promised 3 GB/s read and write speeds are real. It’s quite impressive to see 4-digit megabyte/s values in iotop frequently.

I almost forgot! During this exercise I spent less than 30 minutes in the server room. Instead I was at a nice, warm work-desk set-up that I use in the house as much as I can, and I will tell you about it in another article. The major feature of this work-desk set-up is that it is (a) a standing desk and (b) has a treadmill under it. Yes. Treadmill.

You will get pictures of the set-up in that article, but since I spent more than 10 hours walking on Saturday while doing the disaster recovery, I want to give you a glimpse of what such a set-up means:

46 km while doing disaster recovery successfully.

indoor location tracking with ESP32

This project uses the same approach that I took for my ESP32 based indoor location tracking system (by tracking BLE signal strength). But this project came up with an actual user interface – NICE!

“Indoor positioning of a moving iBeacon, using trilateration and three ESP32 development modules. ESP32 modules report all beacons they see, to MQTT topic. Dashboard subscribes to this topic, and shows the location of beacons which are seen by all three stations.”
(https://github.com/jarkko-hautakorpi/iBeacon-indoor-positioning-demo)
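
For readers who wonder how three signal strengths turn into a position, here is a minimal sketch of the general technique. It is not the code of the linked project: the RSSI value at 1 m, the path loss exponent and the station coordinates are assumptions that would need calibration in a real room.

```python
def rssi_to_distance(rssi_dbm, rssi_at_1m=-59.0, path_loss_exponent=2.0):
    """Estimate the distance in metres using the log-distance path loss model.

    rssi_at_1m and path_loss_exponent are assumed calibration values.
    """
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10.0 * path_loss_exponent))


def trilaterate(p1, r1, p2, r2, p3, r3):
    """Return (x, y) of the point at distances r1, r2, r3 from stations p1, p2, p3.

    Subtracting the circle equations pairwise gives two linear equations
    a*x + b*y = c and d*x + e*y = f, which are solved with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a, b = 2 * (x2 - x1), 2 * (y2 - y1)
    c = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    d, e = 2 * (x3 - x2), 2 * (y3 - y2)
    f = r2**2 - r3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = a * e - b * d
    if abs(det) < 1e-12:
        raise ValueError("stations must not lie on one line")
    return (c * e - b * f) / det, (a * f - c * d) / det


# Hypothetical stations at three corners of a 4 m x 4 m room
stations = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
distances = [rssi_to_distance(rssi) for rssi in (-62.0, -70.0, -66.0)]
print(trilaterate(stations[0], distances[0],
                  stations[1], distances[1],
                  stations[2], distances[2]))
```

In practice BLE signal strength is noisy, so the raw readings are usually averaged or filtered before they are converted to distances.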

hard drive reliability stats 2018

Backblaze is a company that offers cloud storage space and therefore operates a large number of storage arrays.

In their own words:

As of December 31, 2018, we had 106,919 spinning hard drives. 

https://www.backblaze.com/blog/hard-drive-stats-for-2018/

This large number of spinning disks means that every once in a while there are failing drives that stop spinning. Backblaze saw the need to keep track of which hard drive series fail more or less often and started to publish a yearly report on the reliability of these hard drives.

Yesterday they published their report for 2018. If you have storage requirements or if you are in the market to purchase storage space for your operation, it is probably very helpful to take a look at the report.

Apple Airplay for SONOS (in Docker)

We’ve got a couple of SONOS based multi-room-audio zones in our house and with the newest generation of SONOS speakers you can get Apple Airplay. Fancy!

But the older hardware does not support Apple Airplay due to its limited hardware. This is too bad.

So once again Docker and OpenSource + Reverse-Engineering come to the rescue.

AirConnect is a small but fancy tool that bridges SONOS and Chromecast to Airplay effortlessly. Just start and be done.

It works a treat and all of a sudden all those SONOS zones become Airplay devices.

There is also a nice dockerized version that I am using.

waking up to another dying hard disk – upgrade time!

At our house I am running a medium-sized operation when it comes to all the storage and in-house / home-automation needs of the family.

This is done by utilizing several products from QNAP, Synology and a custom built server infrastructure that does most of the heavy-lifting using Docker.

This morning I woke up to an email stating that one of the mirrored drives in the machine is reporting read errors.

Since this drive is part of a larger array of spinning-rust style hard disks, just replacing it would work. But given the age of those drives, I am not particularly keen on doing more replacements in the very near future. So a more general approach seems right.

63083 lifetime hours = 2628 days = 7.2 years powered up
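
The conversion in that line is simple arithmetic and can be checked with a few lines. This is only a sketch; the 365.25 days per year is an assumption to account for leap years.

```python
def powered_on(hours):
    """Convert powered-on hours into (days, years)."""
    days = hours / 24
    years = days / 365.25  # assumption: average year length including leap years
    return days, years


days, years = powered_on(63083)
print(f"{days:.0f} days, {years:.1f} years")
```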

You can see what I mean. This drive is old. Very old. And so are its mates. It is actually the newest one; there are another 6 or so 1.5 TB and 1 TB drives in this array.

This redundant array is in fact still quite small and not fully used, since most storage-intensive, non-service-related disk space demands have moved to iSCSI and other means. So many disks with so much redundancy and so little disk space are no longer needed. Current space utilization seems to be about 20% of the available 2 TB volume.

Time for an upgrade! Taking a look at the manual of the mainboard I had replaced 2 years ago, I found that this mainboard has dual NVMe M.2 ports, from which I can boot according to that same manual.

So I thought: Let’s start with replacing the boot drives and the /var/lib docker portions with something fast.

To my surprise Samsung is building 1 TB NVMe M.2 SSDs at a price I expected to be much higher.

Nice! So let me report back when this has shipped and I can start the re-set-up of the operating system and Docker environment, which in all fairness should be straightforward. I will upgrade from Ubuntu 16.04 LTS to 18.04 LTS in the same step, and the only more complex things I expect are the boot-from-ZFS (on Linux) and iSCSI set-up of the machine.

If you have any tips or best practices, let me know.

I have just started catching up on what happened in the last 2 years with ZFS on Linux. My initial decision 2 years ago to use Linux as the main driver OS and Ubuntu as the distribution was based on the expectation of not having this as my hobby in the following years. And that expectation was fulfilled by Ubuntu 16.04 LTS.

japanese festival calendar

Last year I started to create a calendar that holds all the events and festivals (まつり / matsuri) in Japan, especially Tokyo, that I can get hold of.

Since it has become a custom in my family to spend several weeks several times a year in the Tokyo area this calendar is used and updated frequently.

Of course it is a calendar you can export, import and subscribe to with any iCal / ICS capable device at your disposal. And probably that means any device that has a calendar app or a browser.

You can click this link and subscribe through google calendar: japanese matsuri calendar

#10yearchallenge – Twitter edition

I am seeing a lot of people doing the “10 year challenge”: posting two pictures side by side, one from today and the other from ten years ago.

Some say it might be a planted meme to train AIs on the effects of human aging… What an interesting hypothesis!

Funnily enough I did not take part. And sure enough I am getting all these nudges from services like Twitter… So. Ten years ago this blog existed and I started using Twitter. Apparently I will keep using this website but change my general approach to services like Twitter a bit.

And the next time I’ll explain why I am always awake quite early :-)

small and cheap multi-sensor nodes for home automation

I had previously reported on my efforts to develop an indoor location tracking system. Back in 2017, when I started to work on this, I only planned to use inexpensive Espressif ESP32 SoCs to look for Bluetooth beacons.

In the meantime I figured that I could, and should, also make use of the multiple digital and analog input/output pins this specific SoC offers. And what better to use them with than a range of sensors that could now feed their measurements into an MQTT feed along with the Bluetooth details.

And there is a whole lot of sensors that I’ve added. On a breadboard it looks like this:

So what do we have here:

  • Motion sensor
  • Temperature sensor
  • Humidity sensor
  • Light sensor
  • Barometric pressure sensor
  • and of course an RGB LED to show a status

The software is already done, and after 3 weeks of extensive testing it seems to be stable. I will release it eventually, later in the process.

I’ve also found plastic cases that will fit this many sensors, beyond the sensor cases I had already bought for the ESP32 alone. For now I’ll close this article with some pictures.

The MQTT feed one of these nodes produces…

…and the Grafana dashboard I am using for this specific prototype device.

progressive web applications

These days even heise online is writing up about the wonders of PWA (progressive web applications).

PWA, simply put, is a standardized way to add some context to websites and package them up so they behave as much as possible like a native mobile application. That is, the kind of application you are used to installing onto your phone or tablet, most likely through an app store of some sort.

The aim of PWA is to provide a framework and tooling so that the website is able to provide features like push notifications, background updates, offline modes and so on.

Very neat. Just today I enabled the PWA mode of this website, so you’re now free to add it to your home screen. But fear not: you won’t be pestered with push notifications or any background stuff taking place. It’s merely a more convenient optional shortcut.

Instagram – until now

I had already added a couple of pictures to my Instagram account, mainly while abroad. Pictures that I consider nice enough to be shared.

Of course my latest switch away from those public silos will include having those pictures posted mainly on this website and maybe as a side-note on those services as well.

To begin with I will have a separate page created that will host those pictures I consider nice enough to be shared.


blog maintenance – status

A bit of feedback is in on the plan to revitalize this blog. Thanks for that!

I have spent some more time this weekend on getting everything a bit tidied up.

There is the archive of more than 3,000 posts that I plan to review and re-categorize.

There is the large number of comments made in the past, and I need to come up with a plan for how to allow, disallow or otherwise deal with comments and discussions on this website in general.

There are also the design and template aspects of this website. I switched to a different template and started to adjust it so that it makes access to the stream of posts as easy as possible. Until then you need to wait or contact me through other means. But contacting is another post for another time.

resilvering …

The last Ubuntu kernel update seemingly kicked two hard disks out of a ZFS raidz – sigh. With ZFS on Linux this poses an issue:

Two hard drives that previously were in this ZFS pool named “storagepool” were reassigned completely different device names by Linux. So /dev/sdd became /dev/sdf and so on.

ZFS uses a specific metadata structure to encode information about each hard drive and its relationship to storage pools. When Linux reassigned the hard drives’ names, apparently some things got shaken up in ZFS’ internal structures and mappings.

The solution was these steps:

  • export the ZFS storage pool (=taking it offline for access/turning it off)
  • use the zpool functionality “labelclear” to clear the ZFS label information from the hard drives that had become “unavailable” to the storage pool
  • import the ZFS storage pool back in (=taking it online for access)
  • use the replace functionality of zpool to replace the old drive name with the new drive name.
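
As a rough sketch, the four steps map onto the following zpool invocations. This only builds and prints the command lines and does not run anything. The exact arguments I typed are not recorded here, so which device path each command takes is an assumption, and labelclear is destructive for the device it is pointed at, so double-check before running anything like this.

```python
def recovery_commands(pool, renamed):
    """Build the zpool command lines for the export/labelclear/import/replace steps.

    renamed is a list of (old_device, new_device) pairs. Passing the new
    device path to labelclear is an assumption of this sketch.
    """
    commands = [["zpool", "export", pool]]
    for _old, new in renamed:
        commands.append(["zpool", "labelclear", new])
    commands.append(["zpool", "import", pool])
    for old, new in renamed:
        commands.append(["zpool", "replace", pool, old, new])
    return commands


# Pool name and device names taken from the example in this post
for command in recovery_commands("storagepool", [("/dev/sdd", "/dev/sdf")]):
    print(" ".join(command))
```

Referring to disks by their stable /dev/disk/by-id/ names instead of /dev/sdX when building a pool is a common way to avoid this kind of renaming trouble in the first place.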

After poking around for about 2 hours, the above strategy made the storage pool start rebuilding (resilvering in ZFS speak). Well, learning something every day.

4+ hours to go.

Bonus: I was not immediately informed of the DEGRADED state of the storage pool. That needs to change. A simple command is now run by cron every hour.

zpool status -x | grep state: | tr --delete state: | mosquitto_pub -t house/stappenbach/server/poppyseeds/zpool -l

This pushes the ZFS storage pool state to MQTT, where it gets processed by a small Node-RED flow.
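
The filtering part of that pipeline can also be sketched in a few lines of Python, which makes it easier to see what ends up in the MQTT message. This is an illustration only, not what runs on the server, and the sample text is a shortened, hypothetical example of what the status output looks like for a degraded pool.

```python
# Shortened, hypothetical sample of status output for a degraded pool
SAMPLE = """\
  pool: storagepool
 state: DEGRADED
status: One or more devices could not be opened.
"""


def pool_states(status_output):
    """Return the values of all 'state:' lines found in zpool status text."""
    states = []
    for line in status_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("state:"):
            states.append(stripped[len("state:"):].strip())
    return states


print(pool_states(SAMPLE))
# With healthy pools, "zpool status -x" has no state line, so nothing is reported
print(pool_states("all pools are healthy\n"))
```

One consequence worth knowing: while everything is healthy there is no state line at all, so nothing gets published. The flow on the receiving side therefore cannot tell "healthy" apart from "the cron job stopped running" unless it also watches for missing messages.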

taking the social stream back to this blog

I am currently in the process of reducing my presence on the usual social networks. Here is my reasoning and how I will do it.

Facebook, Twitter, Instagram and the like are seemingly at the peak of their popularity, and more and more users are getting more and more concerned about how their data and privacy are handled by those social networks. So am I.

Now my main concern is not so much on the privacy side. I never published anything on a social network, private or public, that I would not want to see published, freely distributed or leaked. But:

I have published content with the intention that it would be accessible to everyone now and in the future. The increasing risk is that those publishing platforms are going to fade away and thus will render the content I had published there inaccessible.

My preferred way of publishing content and making sure that it stays accessible is this website – my personal blog.

I have been doing this since 2004, the exact year that Facebook was founded. And apparently this website and its content have a good chance of being available longer than the biggest social network of the present time.

So what does this mean? 3 basic implications:

  1. I will become a “lurker” on the social networks. Now and in the future.
  2. All comments and reactions I will make will be either directly in private or through my personal website publicly available and linkable.
  3. I will minimize my footprint on the social networks as much as possible. This for example means: If I use Twitter, all tweets will live 7 days and automatically be removed after this time. Deleting your tweets automatically is something others do as well.

As you can see: this is not about a cut or abstinence. I get information out of social networks and tweet message flows. But I do not put any trust in the longevity of either the platforms or the content published there.

The next step for me will be a complete overhaul of this website: getting everything up to current standards to streamline my publishing process.

Expect a lot of content and change – and: welcome to my blog!