Equivalent conductance due to air infiltrations

Here is how you compute the thermal coupling between a room’s indoor temperature and the outdoor temperature due to infiltrations.

The equation governing the exchange of heat is as follows:

Cair × dT/dt = g × ΔT

where Cair is the indoor air thermal mass [J/K], T is the indoor air temperature [K], g is the equivalent conductance [W/K], and ΔT is the temperature difference between outdoors and indoors [K].

What we need to compute is how much heat is exchanged per unit of time when outside air seeps into the room. The air that infiltrates will draw (or, more rarely, yield) heat from the indoor air until it reaches thermal equilibrium with T. So we obtain:

V’ × ρ × Cp air × ΔT = J

where V’ is the infiltration rate [m³/s], ρ is the air density (about 1.2 kg/m³), Cp air is the specific heat capacity of air (about 1008 J/(kg·K)), and J is the rate of heat exchange [W].

Now we immediately recognize that J / ΔT = g. But we can simplify further by noting that air infiltration rates are more commonly given in room volumes renewed per unit of time (commonly per hour). Note first that:

Cair = Vair × ρ × Cp air

where Vair is the volume of air in the room [m³], and therefore:

V’ × Cair / Vair = g

or more simply, if Nren is the room volume renewal rate per hour:

Nren / 3600 × Cair = g
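For example, with made-up numbers: a 50 m³ room holds Cair = 50 × 1.2 × 1008 ≈ 60500 J/K worth of air thermal mass, so at half a room volume renewed per hour (Nren = 0.5) the equivalent conductance is g = 0.5 / 3600 × 60500 ≈ 8.4 W/K.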

And therefore the main equation becomes simply:

dT/dt = Nren / 3600 × ΔT
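Taking ΔT as the outdoor-minus-indoor difference, this is an exponential relaxation of the indoor temperature towards the outdoor one, T(t) = Tout + (T0 - Tout) × exp(-Nren × t / 3600), where T0 is the initial indoor temperature and Tout the outdoor temperature; the time constant is 3600 / Nren seconds. With the made-up Nren = 0.5 from above, that is 7200 s: left to itself, the indoor air closes about 63% of its remaining temperature gap to the outdoors every two hours.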

I find this to be an amazingly simple equation, especially as it applies to other situations similar to air infiltrations. For example, you can use it to compute the exchange of heat among the elements of a heater through which hot water circulates.

Event rate of arrival analysis with R

Here is a very common problem: suppose you’re given a series of event timestamps. The events can be anything: website logins, people entering a building, anything that recurs regularly in time but whose rate of arrival is not known in advance. Here is, for example, such a file which I had to analyze:

05.02.2010 09:00:18
05.02.2010 09:00:18
05.02.2010 09:00:21
05.02.2010 09:00:23
05.02.2010 09:00:24
05.02.2010 09:00:29
05.02.2010 09:00:29
05.02.2010 09:00:30
05.02.2010 09:00:35

and so on for several thousand lines. Your task is to analyze this data and to derive “interesting” statistics from it. How do you do that?

My initial reaction was, hey let’s try to derive the rate of arrival per second over time. Histograms are one way of doing this, except that histograms are known for the dramatically different results they can yield for different choices of bin width and position. So instead of histograms, I tried doing this with so-called density plots.

That, it turns out, was a terrible idea. I think I must have spent a day and a half figuring out how to use R’s density function and its arguments, especially the bandwidth parameter. There are two problems with density: 1) it yields a density, which means that you have to scale it if you want to obtain a rate of events; 2) it’s an interpolation of a sum of kernels, which has the unfortunate side-effect of yielding a curve whose integral is not necessarily unity.
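For the record, the rescaling itself is only a line or two. A minimal sketch, assuming the timestamps have already been converted to seconds-of-day in a vector I’ll call tod (a name of my choosing, not part of the script below):

d    <- density(tod)       # kernel density estimate; integrates to (roughly) one
rate <- d$y * length(tod)  # rescale from a density to events per second
plot(d$x, rate, type = "l", xlab = "Time of day [s]", ylab = "Events per second")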

On the morning of the second day, I realized I had been solving the wrong problem. I’m not really interested in knowing the rate of arrival. What I really need to know is how many items my system is handling simultaneously. Think about that for a second. If your data represent the visitors to your website, you really don’t want to know how many visitors come per second if you don’t know how much time the server needs to serve each one. In other words, if you want to make sure the server never melts down, you need to know how many users are served concurrently by the server, and to know that you also need to know how much time is needed for each request.

Or again, if you’re designing a building or a space to which people are supposed to come, be served somehow, and then leave, you really don’t need to know how many people will come per hour; you need to know how many people will be in the building at the same time.

There is something called Little’s Law which states this a bit more formally. Assuming the system can serve the requests (or people, or jobs) without any pileup, then N = t × Λ, where N is the number of requests being served concurrently, t is the time spent on each request, and Λ is the request rate. Now it should be obvious that if you know t, the data will give you N, from which you can derive Λ (if you want).
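To make that concrete with made-up numbers: if requests arrive at Λ = 3 per second and each one takes t = 2 seconds to serve, the server is handling N = 2 × 3 = 6 requests at any given moment. Conversely, the counter we build below gives N directly, and dividing by t recovers Λ.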

Here’s how I did it in R. Suppose the data comes in a zipped data.zip file, with timestamps formatted as above. Then:

library(lattice) # always, always, always use this library
# Get raw data as a vector of DateTime objects
data <- as.POSIXct(scan(unzip("data.zip"), what=character(0), sep="\n"), format="%d.%m.%Y %T")
# Turn it into a dataframe which will be easier to use
data.lt <- as.POSIXlt(data)
# jitter spreads events logged within the same second so they don't coincide exactly
data.df <- data.frame(time=data, sec=jitter(data.lt$sec, amount=.5), min=data.lt$min, hour=data.lt$hour)
data.df$timeofday <- with(data.df, sec+60*min+3600*hour)
rm(data, data.lt)

Note that for this example we’ll assume all events happened on the same day. Now here’s the idea. We’re going to build a counter that counts +1 for each event and -1 when that event has been served. In R, we can do that with the cumsum function. For example, suppose we have a series of ten events spaced apart according to a Poisson distribution with mean 4:

> x <- cumsum(rpois(10,4))
> x
 [1]  6  9 14 20 26 33 37 38 38 44

Suppose each event takes kLatency = 6 seconds to serve, and build a structure holding x, the coordinates of the original events and of their completion times, and y, the running counter of the number of events being served:

> kLatency <- 6
> temp <- list(x=c(x, x+kLatency), y=c(rep(1,length(x)), rep(-1, length(x))))
> reorder <- order(temp$x)
> temp$x <- temp$x[reorder]
> temp$y <- cumsum(temp$y[reorder])

If you now plot the temp structure, you get a step curve that rises by one at each arrival and falls by one when the corresponding event completes, kLatency seconds later.
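For instance, with base graphics (just a sketch; the actual analysis below uses lattice):

plot(temp, type = "s", xlab = "Time [s]", ylab = "Events being served")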

With all this in place, we now have everything we need to go ahead. The little script above can go into its own function, or it can be defined as xyplot’s panel argument:

kLatency = 4 # seconds
xyplot(1~timeofday,
       data.df,
       main = paste("Concurrent requests assuming", kLatency, "seconds latency"),
       xlab = "Time of day",
       ylab = "# concurrent requests",
       panel = function(x, darg, ...) {
           temp <- list(x = c(x, x + kLatency), y = c(rep(1, length(x)), rep(-1, length(x))))
           reorder <- order(temp$x)
           temp$x <- temp$x[reorder]
           temp$y <- cumsum(temp$y[reorder])
           panel.lines(temp, type = "s")
       },
       scales = list(x = list(at = seq(0, 86400, 7200),
                              labels = c(0, "", "", 6, "", "", 12, "", "", 18, "", "", 24)),
                     y = list(limits = c(-1, 20))))

Unfortunately I cannot show you the results here, but this analysis showed me immediately when and how often the webserver would be under its heaviest load, and it directly informed our infrastructure needs.
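And if you want the peak load as a number rather than a picture, the same counter can be computed outside the plot. A minimal sketch reusing data.df and kLatency from above:

ev      <- data.df$timeofday
temp    <- list(x = c(ev, ev + kLatency), y = c(rep(1, length(ev)), rep(-1, length(ev))))
reorder <- order(temp$x)
counter <- cumsum(temp$y[reorder])
max(counter)                         # peak number of concurrent requests
temp$x[reorder][which.max(counter)]  # time of day [s] at which the peak occurs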

Book review: Agile Project Management with Scrum

I began reading Ken Schwaber’s ‘Agile Project Management with Scrum’ for two reasons: 1) it’s a book about Scrum, and 2) it’s from Ken Schwaber, one of the fathers of Scrum. Having now read it, I think these are the only reasons I don’t entirely regret reading it.

The book is a series of case studies, based on real-world experiences Schwaber has had managing projects. Each case study shows how a particular aspect of Scrum was applied, adapted, or tweaked on a real project. Schwaber devotes each chapter to a distinct aspect of Scrum, e.g. the ScrumMaster’s role, the product backlog, sprint planning, etc.

As such, the book is clearly intended for experienced ScrumMasters, which I am not. But I think an experienced ScrumMaster reading this book will find it lacking in depth. There is almost too much material here, covered too shallowly. Each chapter might easily provide enough material for a separate book, a workshop, or a detailed whitepaper. Even with my limited knowledge and experience of Scrum, I felt frustrated by the lack of detail Schwaber gives. For instance, he never shows us a real-world example of a product backlog. Instead of a snapshot of a real sprint backlog taped to the wall of a team’s room, the book shows us a nicely formatted table with very obviously watered-down entries.

In short, I think an experienced ScrumMaster will find this book lacking in detail, whereas the Scrum student will find it hard to relate to the case studies. As such I find it hard to recommend this book, and I think anyone interested in Scrum should rather consult Schwaber’s earlier book, ‘Agile Software Development with Scrum’, or Henrik Kniberg’s excellent ‘Scrum and XP from the Trenches’, also available for free from here.