Backfilling is a term that, if you think about its meaning, makes you wonder how or why it became associated with content management. In the construction world, you dig a hole, you put some well-designed stuff (foundation components, drainage, electrical, plumbing, etc.) in it, and then you dump the dirt you dug out back in. Usually, there are one or more inspections to determine if the hole was deep enough and if the stuff was up to code, but nobody is all that interested in the material you are going to dump back in, a.k.a. the backfill, unless you’re building a runway. In the ECM world, we build containers with the express desire to backfill them with valuable material.
Backfilling done right – I have written several posts about the solution we built to manage our loss control inspection reports and related content. We created workflows to glean information from the activity, and we designed and built a management dashboard to report that activity so people could better manage the process.
A current inspection proceeds from the planning stage to distribution, and as the properties of the various documents change, workflows set metadata that drives reporting metrics. When we were building this library, we knew that we had several decades' worth of prior inspection reports that our engineers would eventually want to be able to reference. We realized that those reports didn’t fit the process.
A person tasked with backfilling this library is not expected to research each report and determine when it passed through each status point. Even if they could, we don’t care about those metrics. We aren’t trying to manage past activity or retired engineers, nor are we concerned with historic missed deadlines. On the other hand, some of those old reports contain valuable insights, some contain great examples of communicating difficult concepts, and some contain the original seed of a current topic of interest. In other words, the backfill contains some stuff that we care about and a bunch of stuff that we don’t care about. Guided by those facts, we created a workflow to support backfilling. The people pushing the stuff back in the hole are required to fill in some metadata, and the workflow generates harmless values for the metadata that we don’t care about. By harmless, I mean values that aren’t going to distort those precious metrics.
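To make the idea concrete, here is a minimal Python sketch of a backfill step. It is not the actual workflow; the record type, the field names (planned_date, distributed_date, is_backfill) and the function name are all hypothetical. It assumes the person supplies only a title and an inspection date, and that every other date is set to the inspection date and the record is flagged as backfill.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InspectionReport:
    # Hypothetical metadata for one report in the library.
    title: str
    inspection_date: date
    planned_date: date
    distributed_date: date
    is_backfill: bool = False


def backfill_report(title: str, inspection_date: date) -> InspectionReport:
    """Build a historic report from the two values a person must supply.

    Every status date is set to the inspection date, and the record is
    flagged so that reports can tell it apart from current activity.
    """
    return InspectionReport(
        title=title,
        inspection_date=inspection_date,
        planned_date=inspection_date,
        distributed_date=inspection_date,
        is_backfill=True,
    )
```

The flag matters more than the placeholder dates: it is what lets anything reading the library later decide whether a given value is real or merely harmless.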
Backfilling ignored – Recently, I built a solution consisting of three related custom lists to track recommendations made in those loss control reports. My understanding of this process was that a recommendation would be entered, updates would be made and comments would be collected. Not surprisingly, those are the three lists I created. What I didn’t realize is that sometimes updates take on a life of their own. As someone is crawling through the list of active recommendations, the last three updates might seem much more interesting than the original recommendation and/or the first two updates. Suddenly, a process that was designed to move linearly from ‘A-B-C-D-E’ is starting at ‘C’, and going straight to…well, let’s just say it doesn’t end well.
The fix was easy. If an update is entered without an original entry, a workflow will now create the stub of an original entry. The entry of ‘C’ will cause ‘A-lite’ to be created, along with a task to remind people to complete ‘A’. There is no analogy in the construction world. If a contractor begins backfilling before the well-designed stuff is installed, or even before the work is inspected, the backfill has to be removed and the process has to start over.
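The stub logic can be sketched in a few lines of Python. This is an illustration under assumptions, not the workflow itself: the Tracker class, its three collections and the is_stub flag are hypothetical stand-ins for the custom lists and the reminder task.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:
    # Hypothetical stand-ins for the recommendation list, the update list,
    # and the reminder tasks.
    recommendations: dict = field(default_factory=dict)
    updates: list = field(default_factory=list)
    tasks: list = field(default_factory=list)

    def add_update(self, rec_id: str, text: str) -> None:
        """Record an update, creating an 'A-lite' stub if 'A' is missing."""
        if rec_id not in self.recommendations:
            # No original entry yet: create the stub and a reminder task.
            self.recommendations[rec_id] = {"text": None, "is_stub": True}
            self.tasks.append(f"Complete original recommendation {rec_id}")
        self.updates.append((rec_id, text))
```

A second update against the same recommendation finds the stub already in place, so only one reminder task is ever created for it.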
Some of the more important lessons that emerged from these and other examples include:
Storage solutions have to be designed to hold all the content that is going to be uploaded, not just today’s version of that content. Note: a future post will include a story about this.
Metadata has to include values that represent “this happened so long ago that we don’t know” or “this used to be” or “we don’t really care about this in this case” and other such intelligent NULL values.
Workflows may have to be constructed to facilitate backfilling. These are disposable workflows, but investing in them will result in much more useful information.
Dashboards have to be designed to ignore the intelligent NULL values as well as some of the metadata that can be estimated by workflows. For example, we add historic inspection reports with all dates set to the date of the inspection. We can drive the “inspections per year” metric from those values, but the “average time between inspection and report distribution” would be distorted, because every historic report would count as having been distributed on the day of the inspection.
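The dashboard point in the last lesson can be shown with a small Python sketch. The three sample records, the dictionary keys and the function names are invented for illustration and are not taken from the real library. One metric counts every report; the other skips anything flagged as backfill.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Invented sample data: one historic (backfilled) report and two current ones.
reports = [
    {"inspected": date(1998, 5, 4), "distributed": date(1998, 5, 4), "backfill": True},
    {"inspected": date(2012, 3, 1), "distributed": date(2012, 3, 15), "backfill": False},
    {"inspected": date(2012, 9, 10), "distributed": date(2012, 9, 20), "backfill": False},
]


def inspections_per_year(items):
    """Count every report, historic or current, by inspection year."""
    return Counter(r["inspected"].year for r in items)


def average_days_to_distribution(items):
    """Average inspection-to-distribution delay, ignoring backfilled reports."""
    delays = [
        (r["distributed"] - r["inspected"]).days
        for r in items
        if not r["backfill"]
    ]
    return mean(delays) if delays else None
```

With this sample data, the two current reports took 14 and 10 days to distribute, so the average is 12 days. If the backfilled report were included, its zero-day delay would pull the average down to 8.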
If we build our sites like contractors build foundations, we focus on the wonderful well-behaved content management solution that will rise from our hard work. If we ignore the stuff that is piled up at the edge of our virtual site, we are also ignoring an opportunity to make our solution better and someone’s job easier.