Filling in the blanks with Power Query

Inside of the statue

A couple of weeks ago I posted on using PowerQuery to visualize my many Xfinity outages. The completist in me couldn’t leave the data along with missing days (essentially the days when my monitoring didn’t log an outage).

This post explains how to add them back in. It’s really pretty simple, using some more advanced Power Query functions. The essential algorithm is:

To do that, we need to add a new column to the first dataset, the day of the outage, essentially truncating the start time: Add column For simplicity I’m going to ignore outages that span days.

One Power Query can reference another. So the easiest way forward is to create a new query that implements the algorithm above. ┬áHere’s the full code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
let
  Earliest = List.Min(temp[OutageDay]),
  Latest = List.Max(temp[OutageDay]),
  Difference = Duration.TotalDays(Latest-Earliest),
  InterveningDays = List.Dates(Earliest,Difference,#duration(1,0,0,0)),
  ZeroInterveningDays = List.Difference(InterveningDays,temp[OutageDay]), 
  AsTable = Table.FromList(ZeroInterveningDays,Splitter.SplitByNothing(), {"OutageDay"}),
  WithZero = Table.AddColumn(AsTable,"DownDuration", each 0, Int32.Type),
  SetType = Table.TransformColumnTypes(WithZero, {{"OutageDay", type date}}),
  AddStartTime = Table.TransformColumnTypes(Table.AddColumn(SetType, "StartTime", each DateTime.From([OutageDay])), {{"StartTime", type datetime}}),
  AddEndTime = Table.AddColumn(AddStartTime, "EndTime", each [StartTime]),
  Joined = Table.Combine({temp,AddEndTime})
in
  Joined

The first two lines calculate the earliest and latest outage days in the dataset ’temp’. That way, when the underlying text file changes so too will our dataset.

Next we generate a list of days between the earliest and latest using List.Dates. That function generates a list of dates and requires a start date, a number of days to list and the gap between each date (#duration(1,0,0,0) essentially means 1 day).

List.Difference is like a set minus operation. Given one list (all the days), take away another list (outage days) and return the days that there was not an outage.

From then on, it’s just a case of making the list look like our existing table:

Switch Excel chart to the new table and we finally get a chart with 0 day outages:

Chart

If you’ve not tried Power Query, I really encourage you to give it a try. While it has its eccentricities it really is a very powerful tool for manipulating data.