Digital assistant language teacher

For a couple of months now we have been trying harder to learn a foreign language.

And as we expected, it is very hard to get a proper grasp of speaking the language, especially since it is very different from our mother tongue.

And while comfortably interacting with digital assistants around the house every day in English and German, the thought came up: why don’t these digital assistants help with foreign language listening and speaking training?

I mean, Google Assistant answers questions in the language in which you ask them. Siri and Alexa need to know upfront in which language you are going to ask questions. But at least Alexa can translate between languages…

But in all seriousness: why has this obvious killer feature not been delivered yet? Everyone could already have a personal language training partner…

a virtual network inside your machine

Did you ever start a horde of virtual machines and a complicated VM-only network setup just to simulate a moderately complex network and the interaction of the nodes in that network? Well, that’s a tiresome, error-prone and labour-intensive process. Fear no more, there’s a tool to the rescue.

“Mininet creates a realistic virtual network, running real kernel, switch and application code, on a single machine (VM, cloud or native), in seconds, with a single command:”

[Image: Mininet front page diagram]

“Because you can easily interact with your network using the Mininet CLI (and API), customize it, share it with others, or deploy it on real hardware, Mininet is useful for development, teaching, and research. Mininet is also a great way to develop, share, and experiment with OpenFlow and Software-Defined Networking systems.

Mininet is actively developed and supported, and is released under a permissive BSD Open Source license. We encourage contribution of code, bug reports/fixes, documentation, and anything else that can improve the system!”

Source: http://mininet.github.com/

Security Engineering — The Book

The second edition of the book “Security Engineering” by Ross Anderson is available as a full download. It’s quite a reference and a must-read for anybody with an interest in security (which all developers, for example, should have).

“When I wrote the first edition, we put the chapters online free after four years and found that this boosted sales of the paper edition. People would find a useful chapter online and then buy the book to have it as a reference. Wiley and I agreed to do the same with the second edition, and now, four years after publication, I am putting all the chapters online for free. Enjoy them – and I hope you’ll buy the paper version to have as a convenient shelf reference.”

Source 1: http://www.cl.cam.ac.uk/~rja14/book.html

know your numbers!

Wikipedia describes latency this way:

“Latency is a measure of time delay experienced in a system, the precise definition of which depends on the system and the time being measured. In communications, the lower limit of latency is determined by the medium being used for communications. In reliable two-way communication systems, latency limits the maximum rate that information can be transmitted, as there is often a limit on the amount of information that is “in-flight” at any one moment. In the field of human-machine interaction, perceptible latency has a strong effect on user satisfaction and usability.” (Wikipedia)

Given that, it’s quite important for any developer to know their numbers. Since latency has a huge impact on how software should be architected, it’s important to keep these figures in mind:

 

[Screenshot, 2012-12-25, 21:28:20]

 

Source: http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html
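The point in the quoted definition, that latency caps the transmission rate when only a limited amount of data can be “in flight”, can be made concrete with a bit of arithmetic. The following is a minimal sketch assuming a simple window-based protocol; the function name, the 64 KiB window and the round-trip times are illustrative assumptions and not figures taken from the source.

```python
def max_throughput_bytes_per_s(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most `window_bytes` may be
    unacknowledged ("in flight") during one round trip."""
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes / rtt_seconds


# Assumed example values: one 64 KiB window, three made-up round-trip times.
WINDOW_BYTES = 64 * 1024
EXAMPLE_RTTS = [
    ("same datacenter", 0.0005),
    ("same continent", 0.03),
    ("intercontinental", 0.15),
]

for label, rtt in EXAMPLE_RTTS:
    mbit_per_s = max_throughput_bytes_per_s(WINDOW_BYTES, rtt) * 8 / 1_000_000
    print(f"{label:>16}: at most {mbit_per_s:10.1f} Mbit/s")
```

The bound depends only on the window size and the round-trip time, so under these assumptions adding raw bandwidth does not help once latency is the limiting factor.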

Build a Brain – SPAUN

SPAUN, the Semantic Pointer Architecture Unified Network, is a promising next step in the effort to simulate a human brain. Building on the Nengo neural simulator, scientists at the University of Waterloo in Ontario were able to report their first breakthrough results.

In 2013 there will be a book from Oxford University Press called ‘How to Build a Brain’, which will describe in depth what made these astonishing results possible.

But what are the results?

Well, that looks like number recognition. In fact, that’s what it is. SPAUN, as the scientists call their Frankenstein brain, is now capable of solving 8 different tasks. One of them is number recognition. There are videos of all 8 tasks being performed.

Semantic pointers are named after the pointers common in computer science:

“Higher-level cognitive functions in biological systems are made possible by semantic pointers. Semantic pointers are neural representations that carry partial semantic content and are composable into the representational structures necessary to support complex cognition.

The term ‘semantic pointer’ was chosen because the representations in the architecture are like ‘pointers’ in computer science (insofar as they can be ‘dereferenced’ to access large amounts of information which they do not directly carry). However, they are ‘semantic’ (unlike pointers in computer science) because these representations capture relations in a semantic vector space in virtue of their distances to one another, as typically envisaged by connectionists.”
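To get a feeling for the two ideas in that quote, here is a toy sketch. It is not the SPAUN or Nengo implementation: the three-dimensional vectors, the names and the lookup table are all made up for illustration. Cosine similarity stands in for “distance in a semantic vector space”, and a dictionary lookup stands in for “dereferencing” a compact representation to reach richer content.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Hypothetical compact "pointers": short vectors whose distances encode similarity.
POINTERS = {
    "dog": [0.9, 0.8, 0.1],
    "wolf": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

# Hypothetical richer content that a pointer does not carry itself.
DETAILS = {
    "dog": "domesticated canine, barks, lives with humans",
    "wolf": "wild canine, howls, lives in packs",
    "car": "vehicle with four wheels and an engine",
}


def nearest(name):
    """Return the other pointer that is closest to `name` in the vector space."""
    others = (other for other in POINTERS if other != name)
    return max(others, key=lambda o: cosine_similarity(POINTERS[name], POINTERS[o]))


def dereference(name):
    """Follow a pointer to the larger piece of information behind it."""
    return DETAILS[name]


closest = nearest("dog")
print(closest, "->", dereference(closest))
```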

Source 1: http://nengo.ca/build-a-brain
Source 2: http://nengo.ca/build-a-brain/spaunvideos/

 

practical filesystem design

In November 1998 a book about file system design was released that takes the Be File System as its central example.

“This is the new guide to the design and implementation of file systems in general, and the Be File System (BFS) in particular. This book covers all topics related to file systems, going into considerable depth where traditional operating systems books often stop. Advanced topics are covered in detail such as journaling, attributes, indexing and query processing. Built from scratch as a modern 64 bit, journaled file system, BFS is the primary file system for the Be Operating System (BeOS), which was designed for high performance multimedia applications.

You do not have to be a kernel architect or file system engineer to use Practical File System Design. Neither do you have to be a BeOS developer or user. Only basic knowledge of C is required. If you have ever wondered about how file systems work, how to implement one, or want to learn more about the Be File System, this book is all you will need.”

If you’re interested in the matter I definitely recommend reading it. It’s available for free in PDF format and will help you understand what those file system patterns are all about, including things we still haven’t gotten from our ‘modern filesystems’ today.

Source 1: http://www.nobius.org/~dbg/

second Tokyo Trip 2012 – Rakuten Technology Conference 2012

This October I had the pleasure of flying to Tokyo for the second time in 2012.

The development unit of Rakuten Japan was hosting the 7th Rakuten Technology Conference in Rakuten Tower 1 in Tokyo.

The schedule was packed, with up to 6 tracks in parallel, covering a lot of interesting topics from research to grass-roots development.

Source 1: http://tech.rakuten.co.jp/rtc2012/
Source 2: Recorded Lectures

open source audio codecs getting better

Some weeks ago I heard about a new audio codec which is being developed as open source, very similar to Vorbis, the previous open source approach to audio codecs.

This time it seems that they have brought standardization into play, so it might be more successful than Vorbis was.

“Opus is a totally open, royalty-free, highly versatile audio codec. Opus is unmatched for interactive speech and music transmission over the Internet, but also intended for storage and streaming applications. It is standardized by the Internet Engineering Task Force (IETF) as RFC 6716 which incorporated technology from Skype’s SILK codec and Xiph.Org’s CELT codec.”

Source 1: http://www.opus-codec.org/
Source 2: http://auphonic.com/blog/2012/09/26/opus-revolutionary-open-audio-codec-podcasts-and-internet-audio/
Source 3: http://tools.ietf.org/html/rfc6716

Photosynth now mobile…

It’s been some years since the former Microsoft Research project went public and Microsoft started offering its great Photosynth service to the public.

I’ve been using the Microsoft panoramic and Photosynth tools for years now and I tend to say that they are the best tools one can get to create fast, easy and high-quality panoramic images.

There is photosynth.net to store all those panoramic pictures like this one from 2008:

The Photosynth technology itself contains several other interesting technologies like Seadragon, which allows high-quality image zooming at current internet connection speeds.

This awesome technology is now available on the iPhone (3GS and upwards) and it’s better than all the other panoramic tools I’ve used on a phone.

the process of taking the images

after the pictures are taken, additional stitching is needed

after the stitching has completed, a fairly impressive panoramic image is the result

Source 1: Photosynth articles from the past
Source 2: Photosynth in Wikipedia
Source 3: Photosynth on iPhone App Store

benchmarking the sones GraphDB (on Mono (sgen) and .NET)

While we’re at it: we not only put the new Mono garbage collector through its paces regarding linear scaling, we also made some interesting measurements of query performance on the two .NET platform alternatives.

The same data was used as in the last article about the Mono GC. It’s basically a set of 200,000 nodes, each of which holds between 15 and 25 edges to instances of another type of node. One INSERT operation means that the starting node and all edges plus connected nodes are inserted at once.

We did not use any bulk loading optimizations – we just fed the sones GraphDB with the INSERT queries. We tested on two platforms – on Windows x64 we used the Microsoft .NET Framework and on Linux x64 we used a current Mono 2.7 build which soon will be replaced by the 2.8 release.

After the import was done we started the benchmarking runs. Every run was given a specified time to complete its job. The number of queries that were executed within this time window was logged. Each run utilized 10 simultaneously querying clients. Each client executed randomly generated queries with pre-specified complexity.
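The measurement procedure described above (a fixed time window, several concurrent clients, a count of completed queries) can be sketched roughly like this. This is not the harness we actually used: `run_query` and `make_random_query` are hypothetical stand-ins, and the sleep merely simulates the time a real database round trip would take.

```python
import random
import threading
import time


def make_random_query(rng: random.Random, complexity: int) -> list:
    """Hypothetical query: `complexity` randomly chosen node ids to look up."""
    return [rng.randrange(200_000) for _ in range(complexity)]


def run_query(query: list) -> None:
    """Hypothetical stand-in for sending one query to the database."""
    time.sleep(0.001 * len(query))


def benchmark(duration_s: float, clients: int = 10, complexity: int = 2) -> float:
    """Let `clients` threads query for `duration_s` seconds; return queries per second."""
    counts = [0] * clients  # one slot per client, so no locking is needed
    deadline = time.monotonic() + duration_s

    def client(index: int) -> None:
        rng = random.Random(index)  # a separate, reproducible generator per client
        while time.monotonic() < deadline:
            run_query(make_random_query(rng, complexity))
            counts[index] += 1

    threads = [threading.Thread(target=client, args=(i,)) for i in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(counts) / duration_s


if __name__ == "__main__":
    print(f"{benchmark(duration_s=0.5):.0f} queries per second")
```

Counting completed queries inside a fixed window, rather than timing a fixed number of queries, keeps runs of different lengths directly comparable.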

The Import

Not surprisingly, both platforms are almost head-to-head in average import times. While Mono starts out much faster than .NET, the .NET platform is faster in the end with a larger dataset. We also measured the RAM consumption on each platform, and it turns out that while Mono takes 17 KB per complex insert operation on average, the Microsoft .NET Framework only seems to take 11 KB per complex insert operation.
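To put the per-insert figures into perspective, here is a rough back-of-the-envelope calculation. It assumes one complex insert operation per starting node, so 200,000 operations in total, and uses 1 MB = 1000 KB; both are simplifying assumptions on top of the measured averages.

```python
INSERT_OPERATIONS = 200_000  # assumption: one complex insert per starting node
KB_PER_INSERT = {"Mono": 17, ".NET": 11}  # measured averages quoted above


def total_megabytes(operations: int, kb_per_operation: int) -> float:
    """Total memory in MB, using 1 MB = 1000 KB."""
    return operations * kb_per_operation / 1000


for platform, kb in KB_PER_INSERT.items():
    print(f"{platform}: about {total_megabytes(INSERT_OPERATIONS, kb):,.0f} MB for the full import")
```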

The Benchmark

Let the charts speak for themselves first:

[Figure: Mono and .NET comparison]

[Figure: benchmark on Mono (sgen)]

[Figure: benchmark on .NET]

As you can see, on both platforms the sones GraphDB is able to work through more than 2,000 queries per second on average. For the longest running benchmark (1800 seconds) with all the data imported, .NET allows us to answer 2,339 queries per second while Mono allows us to answer 1,980 queries per second.
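For the longest run, the two throughput figures can be turned into totals and a ratio with a few lines of arithmetic. The sketch assumes that the quoted queries-per-second values are averages over the full 1800 seconds.

```python
DURATION_S = 1800  # the longest benchmark run
QUERIES_PER_SECOND = {".NET": 2339, "Mono": 1980}  # figures quoted above


def total_queries(qps: int, duration_s: int) -> int:
    """Number of queries answered if `qps` is the average over the whole run."""
    return qps * duration_s


for platform, qps in QUERIES_PER_SECOND.items():
    print(f"{platform}: {total_queries(qps, DURATION_S):,} queries in {DURATION_S} s")

ratio = QUERIES_PER_SECOND["Mono"] / QUERIES_PER_SECOND[".NET"]
print(f"Mono reaches {ratio:.1%} of the .NET throughput")
```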

The Conclusion

With the new generational garbage collector, Mono surely made a great leap forward. It’s impressive to see the progress the Mono team was able to make in recent months regarding performance and memory consumption. We already consider Mono an important part of our platform strategy, and this new garbage collector and the benchmark results show us that it’s the right thing to do!

UPDATE: There was a mishap in the “import objects per second” row of the above table.