Monthly Archives: April 2008

Denormalization – Database and XML documents

Denormalizing is a process in which hierarchical data is flattened, and data elements common to the records are copied across those records.

Denormalizing a database
The term “Denormalization” is usually used in the context of a database. For example, consider a database consisting of “Project” and “Phase” tables. Let us assume that there exist two projects, each with 4 phases.

In a normalized structure, Phase table has an FK into the Project table, and the information for project (such as the project name) is stored in the project table (the parent table), and is referred from the Phase table (the child table). It kinda looks like this:

Project table Phase table
ID Name
1 New OS
2 New Browser


Project ID Phase ID Phase Name
1 1.a Market Research
1 1.b Design
2 2.1 Design
….Other phases…


In a denormalized, or a flat structure, both tables can be morphed into one table, say ProjectPhase, and that has information about both Projects and Phases. In such a structure, the project name would be copied at each record corresponding to each phase of the project.

Project ID Project Name Phase ID Phase Name
1 New OS 1.a Market Research
1 New OS 1.b Design
2 New Browser 2.1 Design
….Other phases…

Denormalizing an XML document using XSLT

Ok, now that we have a lengthy discussion on denormalization behind us, let us look at XML denormalization. XML denormalization works, the same way, and hierarchy XML structure can be denormalized into a “flatter” XML document.

Consider this hierarchical (normalized) XML file:

<project project_id=”2″>
<name>Brand New Browser</name>
<phase id=”2.A”>
<desc>Firefox Cleanup</desc>
<task priority=”1″><resource>M</resource></task>
<task priority=”2″><resource>M</resource></task>
<phase id=”2.B”>
<task priority=”1″><resource>B007</resource></task>
<task priority=”2″><resource>SC</resource></task>

Consider this corresponding denormalized XML file:

<projectname>Brand New Browser</projectname>
<phasedesc>Firefox Cleanup</phasedesc>
<projectname>Brand New Browser</projectname>
<phasedesc>Firefox Cleanup</phasedesc>


This denormalization can be done very easily using an XSLT. The following XSLT conveys some concepts used in this:

<?xml version = “1.0” encoding = “UTF-8”?>
<xsl:transform xmlns:xsl = “” version = “1.0”>
<xsl:template match = “AllProjects”>
<xsl:apply-templates select = “project/phase/task”/>
<xsl:template match = “task”>
<xsl:element name = “task”>
<projectid><xsl:value-of select=”../../@project_id”/></projectid>
<projectname><xsl:value-of select=”../../name”/></projectname>
<phaseid><xsl:value-of select=”../@id”/></phaseid>
<phasedesc><xsl:value-of select=”../desc”/></phasedesc>
<priority><xsl:value-of select=”@priority”/></priority>
<resource><xsl:value-of select=”resource”/></resource>

"I work hard but I can’t give you an estimate"

Every company has the hard working guy who simply wont commit to a deadline. Every single time the conversation kind of goes like this:

– “Nope, have no idea when I will be done”. (shrugging, shaking his head)
– “A week?”
– “Could be. Can’t really say.”
– “Can it be 3 weeks?”
– “Sure, if I run into some unexpected problem!”
– “So no chance you can be done this week”
– “Again, cant say, if I get lucky, we could be done”

Needless to say, people like this are dreadful to manage. They throw everyone else’s schedule in a spin, and do not allow any planning to happen.

These kinds of people typically fall in two categories:
(i) Have been burnt by being held accountable and do not want to be burnt again
(ii) Do not have any point wise plan, so cannot really estimate how long a task would take

Depending upon which category your guy falls in, you will need to take appropriate route. It is much easier if your problem guy falls in second category: you can just help him develop a plan and have him estimate the smaller pieces. If, your guy is the first category, then you can offer some tech therapy, but results may be rather mixed.