Category Archives: Tips & Tricks

Compacting an Outlook PST file

While migrating to a new computer recently, I noticed that my Outlook PST file (sitting in the C:\Users\Amrinder\AppData\Local\Microsoft\Outlook folder) was about 4 GB. While that is not necessarily a problem, the folder size reported for Personal Folders itself was only about 2.4 GB. (You can check this by right-clicking Personal Folders -> Properties for Personal Folders -> Folder Size.)

Outlook PST Folder Size

 

So, this raises the question: if Personal Folders is 2.4 GB, why is the PST file 4 GB? The answer lies in the structure of the PST file. Essentially, PST is a format that can be loaded, saved and searched very quickly (gigabytes of information are loaded when we start up Outlook, and are searchable in about 5 seconds). As students and practitioners of data structures and algorithms will no doubt note, this requires that the data files be indexed on many different columns, and that those index files be stored along with the data files. Further, index files contain pointers back to the data files, so when the data files change (for example, when an email is deleted), the data file cannot simply be modified without modifying the index files as well. That leads to two choices. The first is to reprocess the index files every time you delete an email (or any other element), which would make for a very unresponsive program. The other is to let some of the changes stay pending and run maintenance once in a while, in which you shrink the data files, reclaim the deleted blocks and reindex the files.
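The second choice can be sketched in a few lines of Python. This is a toy model only, not the actual PST layout: the class name ToyStore, its methods, and the idea of one list standing in for the data file and one dict standing in for the index are all assumptions made for illustration.

```python
class ToyStore:
    """Toy model of lazy deletion: delete() only marks a block, compact() reclaims it."""

    def __init__(self):
        self.blocks = []   # stands in for the data file: [key, payload, live] entries
        self.index = {}    # stands in for the index file: key -> position in blocks

    def add(self, key, payload):
        self.index[key] = len(self.blocks)
        self.blocks.append([key, payload, True])

    def delete(self, key):
        # Cheap: drop the index entry and flag the block, but leave the bytes in place.
        pos = self.index.pop(key)
        self.blocks[pos][2] = False

    def folder_size(self):
        # What the user sees: only live items count.
        return sum(len(payload) for _, payload, live in self.blocks if live)

    def file_size(self):
        # What sits on disk: live and dead blocks alike.
        return sum(len(payload) for _, payload, _ in self.blocks)

    def compact(self):
        # Expensive maintenance: drop dead blocks, then rebuild the index.
        self.blocks = [block for block in self.blocks if block[2]]
        self.index = {block[0]: pos for pos, block in enumerate(self.blocks)}


store = ToyStore()
store.add("inbox-1", b"a" * 100)
store.add("sent-1", b"b" * 1000)
store.delete("sent-1")
print(store.folder_size(), store.file_size())  # folder shrank, file did not
store.compact()
print(store.folder_size(), store.file_size())  # now they agree
```

In this toy, delete stays fast because it never moves data, and the gap between folder_size() and file_size() is exactly the space that a later compaction gets back.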

Interestingly, you can run a small test for yourself.  First, note the size of Personal Folders and the size of the PST file. Then, send a 10 MB file as an attachment to someone, and note your Folder size and the size of the PST file.  Then, delete the email from Sent Items and check the folder size and the size of the PST file again.  You will likely observe that while the size of the Personal Folders went from x to x+10 to x, the PST file only went from y to y+10+delta, and didn’t really go down to y after you deleted the email.  So, read on.

This aspect of dirty records is not limited to Outlook. Relational database management systems (such as Oracle) do essentially the same: deleting records does not immediately shrink the data files, and the space is reclaimed (and the indexes rebuilt) by periodic maintenance. In Oracle, you can trigger this kind of “data cleansing/recomputation” manually, for example with ALTER TABLE ... SHRINK SPACE or ALTER INDEX ... REBUILD (ANALYZE TABLE, by contrast, gathers statistics rather than reclaiming space). In Outlook too, you can do this via Personal Folders -> Properties for Personal Folders -> Advanced -> Compact Now.
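The same effect is easy to reproduce with SQLite, used here only as a stand-in because it ships with Python; it is neither Outlook nor Oracle. The function name, table name and row counts below are made up for the demonstration, and the sketch assumes SQLite's auto_vacuum is off (it is set explicitly to be safe).

```python
import os
import sqlite3
import tempfile


def sizes_around_vacuum(rows=200, payload_bytes=10_000):
    """Return database file sizes (bytes) after insert, after delete, after VACUUM."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    try:
        conn.execute("PRAGMA auto_vacuum = NONE")  # keep freed pages in the file
        conn.execute("CREATE TABLE mail (id INTEGER PRIMARY KEY, body BLOB)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO mail (body) VALUES (?)",
            [(b"x" * payload_bytes,) for _ in range(rows)],
        )
        conn.execute("COMMIT")
        after_insert = os.path.getsize(path)

        conn.execute("DELETE FROM mail")   # rows are gone, pages go on a free list
        after_delete = os.path.getsize(path)

        conn.execute("VACUUM")             # rebuild the file without the free pages
        after_vacuum = os.path.getsize(path)
        return after_insert, after_delete, after_vacuum
    finally:
        conn.close()
        os.remove(path)


if __name__ == "__main__":
    print(sizes_around_vacuum())
```

Running it prints three sizes: the file does not shrink after the delete, and only the explicit VACUUM brings it back down, which is the database counterpart of Compact Now.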

Outlook PST Compact Now

 

[Now, if you continue your test, you may observe that the PST file does shrink back after the email deletion and data compaction.]

One thing that I did observe: even after doing the data compaction (really, there is such a verb), the file did not shrink to the size that Personal Folders was suggesting. After shutting down Outlook, restarting it, and running the compaction one more time, it got to about 20% of the folder size.

And the winner of SJ launcher contest is..

OK, there was no such contest, but I used the 3 launchers (Launchy, Keylink and SlickRun) for about one week each, as I mentioned here, and here is my summary.

Launcher    Score
SlickRun    95
Launchy     94
Keylink     60

The single winner is SlickRun. My only previous issue with SlickRun was that it showed up in the taskbar, and when I pressed Alt+Tab, it showed up in the list of programs. The launcher was getting in the way. That turned out to be just a default setting. Simply click on Options in Setup, check the box “AutoHide SlickRun”, and the problem goes away.

Launchy is a close second – my only gripe with it was that sometimes it was taking too much CPU, but that may not be a general problem.

As a direct comparison between Launchy and SlickRun – Launchy starts off by building a neat catalog, but SlickRun is easier to use in terms of defining a new alias (magic word). From my perspective, SlickRun is a winner in the longer term as you can build your aliases based on what you use more often.

Keylink is a distant third – I don’t recommend it.

Launchy, Keylink and other slick things

Windows 7 includes a much smoother program launch start bar than its predecessors. Just press the Win key on the keyboard and start typing, and the program or document that you are thinking of shows up. Very slick.

Oil Slick, courtesy NASA Goddard Photo and Video

Very slick, but obviously not slick enough, since I still continue to use Launchy, the keystroke launcher program, and many of my friends continue to use SlickRun and Keylink. (This post isn’t about comparing these launchers; it is just to show their value compared to the solutions built into the OS. My informal comparison of these 3 has been added as a comment.) Launchers used to be absolutely critical with the previous versions of Windows, and although 7 has a great start bar, there are some ways in which custom launchers still hold the edge. Since launching a program or a document is something you do hundreds of times a day, even half a second saved each time is enough to justify a specialized program. So, what are Launchy’s advantages compared to the Windows 7 built-in launcher?

Primarily, it is the speed.  Launchy is just the launch bar, minus the start menu.  So, it shows up faster.  Also, it looks for only the programs (although you can add other things to its search catalog), so the search speed is faster.

Then, there is the issue of command line arguments. In many launchers, you can start typing “Firefox” or “Google” and then type the search phrase, and that will launch the browser and search for the given phrase. This would save you more than half a second compared to a native solution, but it is also a slightly less common scenario, since most of us have a browser open most of the time.
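As a rough sketch of what such a launcher does with the typed text: the first word is looked up as an alias and the rest is passed along as the argument. The alias table, the function name expand and the URL template are assumptions for illustration, not how Launchy or SlickRun actually store their settings.

```python
from urllib.parse import quote_plus

# Hypothetical alias table: magic word -> URL template with a {query} slot.
ALIASES = {
    "google": "https://www.google.com/search?q={query}",
}


def expand(typed):
    """Turn 'google some phrase' into a search URL, or None for an unknown alias."""
    word, _, rest = typed.strip().partition(" ")
    template = ALIASES.get(word.lower())
    if template is None:
        return None
    return template.format(query=quote_plus(rest.strip()))


print(expand("google outlook pst compact"))
```

A real launcher would then hand the resulting URL to the browser, for example with Python's webbrowser.open.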

Then there is this small matter: Launchy’s box comes up in the center of the screen. The Windows Start menu’s box is at the bottom left corner, at the bottom of a large (and distracting) search menu. As screens become larger, this becomes a slightly larger issue. When we want to launch a program, the launch process should be as small an interruption as possible.

[One slight modification I make to Launchy right after installing it: I change the hot key to Ctrl Alt Space, instead of Alt Space, since I frequently use Alt Space C to close programs.]

Log Viewers, Tails, Chainsaws and 97 Other Reasons Developers Fight with Managers

When I worked with IBM on a navy project, my boss introduced me to the beauty of Chainsaw, a log viewer that ships with log4j. “It is called Chainsaw, because it cuts logs to size”, he said with a smirk. Tech managers have a bad sense of humor, but anyway I didn’t want to tell him that I didn’t know what log4j meant, as that would have made me sound dumb. So, I picked up log4j, and right from the get-go, really liked it. Coupled with log4j, Chainsaw is a big productivity booster, although you can also use Chainsaw with JDK logging.

Logging and log viewers can have a significant impact on developers’ productivity, so item #7 that I usually cover in the Top 10 Activities that Affect our Productivity is usually about log viewers. There are, of course, a few alternatives to Chainsaw, but tail+grep isn’t one of them. The tail+grep combination is used so frequently simply because developers don’t like being told that something can’t be done using grep. Developers are, generally speaking, gritty people. They are there because they like challenges, are knowledgeable and may be opinionated. The best way of bonding with them is by ranting against evil companies. It is sometimes difficult to teach them something new, because hey, they have this really cool other software that not only does what you want it to do, it also makes great foam while playing chess during its spare cycles. And not only that, you can actually do anything with it.
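To make the difference concrete, here is a small sketch of the kind of structured filtering a log viewer offers and a plain text match does not: a threshold on the log level plus a filter on the logger name. It is not Chainsaw's implementation; the line layout ("date time LEVEL logger - message"), the function name and the sample lines are all assumptions.

```python
import re

# log4j's standard levels, in increasing severity.
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4, "FATAL": 5}

# Assumed layout: "<date> <time> <LEVEL> <logger> - <message>"
LINE = re.compile(r"^(\S+ \S+) +(\w+) +(\S+) +- +(.*)$")


def filter_log(lines, min_level="WARN", logger_prefix=""):
    """Keep parsed entries at or above min_level whose logger starts with logger_prefix."""
    threshold = LEVELS[min_level]
    kept = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # e.g. stack-trace continuation lines
        timestamp, level, logger, message = match.groups()
        if LEVELS.get(level, -1) >= threshold and logger.startswith(logger_prefix):
            kept.append((timestamp, level, logger, message))
    return kept


sample = [
    "2010-03-01 10:00:00,123 INFO com.example.Mail - started",
    "2010-03-01 10:00:01,456 ERROR com.example.Mail - send failed",
    "    at com.example.Mail.send(Mail.java:42)",
    "2010-03-01 10:00:02,789 WARN com.example.Db - slow query",
]
for entry in filter_log(sample, min_level="WARN", logger_prefix="com.example.Mail"):
    print(entry)
```

Asking for "WARN and above" is a single comparison here, whereas with grep it means listing every matching level by hand.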

The problem isn’t really with developers; it is broader. Developers get caught in this solely due to bloggers blogging about developers. At a more basic level, this problem exists in any form of marketing and is succinctly known as the advertising rule of 7. Whether the empirical number 7 is correct or not, the notion that sales happen after multiple touch points is hardly debatable.

So, if you have never heard about chainsaw before, you can start counting this as the first touch point. 6 more to go, and I think you will start using it.

[Update: See Scott’s comment below and check out the awesome new version.]

Word is now almost 70% of LaTeX

Way back in 1997, my independent study assignment for Prof. S K Gupta at IITD was going hopelessly fast. I had 3 months of graph theory experience (and was still struggling with the chromatic polynomial while having clear intentions of solving Ulam’s reconstruction conjecture before the end of the year), and being a compsci junior, I had not yet studied two advanced classes in theoretical computer science. But that was all fine. The troubling part was that I didn’t know LaTeX. My typesetting skills were limited to the “cutting edge” tool, Microsoft Word from Office 97 (where Clippy makes an appearance). Prof. Gupta suggested we make a journal article out of the work, and that involved LaTeX. So, not creative output or graph theoretic discoveries, but LaTeX was the bridge that I had to cross to get to the “published” side?

Manhattan Bridge, Photo courtesy See-Ming Lee

Of course, LaTeX is not WYSIWYG. You type in a text editor, and the output is a highly polished, professionally typeset document (for example, in PDF). How TeX does that is really magic, and that magic is the reason TeX is still the mainstay of scientific publishing. That said, the gap between Word and LaTeX is narrowing, to the point that I am ready to claim that Word 2007 has almost 70% of the feature set of LaTeX. The single biggest gap that still remains is that LaTeX handles floats (pictures/tables) beautifully. Word does not. In Word, you put a picture where you need it, but in LaTeX you put a picture “somewhere here”. The “somewhere here” is pretty powerful, because as your article goes through versions, LaTeX continues to move the picture somewhere close by to adjust. This is a big deal breaker for an article that is more than 3 pages long. People do use Word for long articles, but a majority of them do so because they have no choice, and no time to learn LaTeX.

There were two other deal breakers with previous versions of Word – the references and the citations.

Let us talk about references. Your document has sections with headings, like Section 1. Introduction, Section 2. Problem Statement, etc. (You know how to get headings to have automatic numbering, right?) Suppose you say in the conclusions, “As we discussed in Section 3.1, ….”; the “3.1” is a reference. You don’t want to actually type in 3.1, as that number may become 4.1 or 3.2 if you add another section or subsection before it. The solution is to insert a cross-reference (using References -> Captions -> Cross-Reference from the ribbon) and point it at the heading number. This adds a logical reference to that section, much like using a \ref and \label combination does in LaTeX. Alt-INR is a handy keyboard shortcut, which is from Word 2003 but works in Word 2007 as well.

This, however, is only part of the story. Suppose you added a logical reference, and now, as chance would have it, you did end up adding another section in the beginning. You quickly navigate back to the conclusions to see if it says “Section 4.1”, just like magic. But there is no magic: it still says “Section 3.1”. What an otter wastage of time!

Otter, courtesy Mike Baird

So what went wrong? Well, for one, you didn’t say abra ca dabra. Secondly, whether in LaTeX or in Word, you do need to tell the typesetter to “prepare” once in a while. In LaTeX you do so by “compiling” the document. In Word, you do this by using the shortcut key F9 (refresh) on that field. For a long document, you can just select everything (Ctrl-A) and then press F9. So, in other words, Ctrl-A F9 is Word’s equivalent of LaTeX’s compilation. The combination of these two things, adding logical references instead of hard coded ones and then using Ctrl-A F9, makes us much more efficient typesetters and authors.
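The idea behind both tools can be sketched as a two-step process: number the headings, then replace each logical reference with the current number. The {ref:label} placeholder syntax, the function names and the sample headings below are invented for this sketch; they are neither Word field codes nor LaTeX.

```python
import re


def number_sections(headings):
    """headings: list of (depth, label); depth 1 is a section, 2 a subsection.
    Returns a dict mapping each label to its number, such as '3.1'."""
    counters = []
    numbers = {}
    for depth, label in headings:
        counters = counters[:depth]          # forget counters of deeper levels
        while len(counters) < depth:
            counters.append(0)
        counters[depth - 1] += 1
        numbers[label] = ".".join(str(c) for c in counters)
    return numbers


def refresh(text, numbers):
    """The 'Ctrl-A F9' step: replace every {ref:label} with its current number."""
    return re.sub(r"\{ref:([\w-]+)\}", lambda m: numbers[m.group(1)], text)


headings = [(1, "intro"), (1, "problem"), (1, "method"), (2, "method-details"), (1, "conclusions")]
text = "As we discussed in Section {ref:method-details}, ..."
print(refresh(text, number_sections(headings)))

# Add a new section at the beginning, then refresh again.
headings.insert(0, (1, "background"))
print(refresh(text, number_sections(headings)))
```

The stored text never contains the number itself, only the label, which is why the number can only change when the refresh (or compile) step is run again.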

The citation management in Word 2007 has improved, but there are still gaps, and some tips and tricks. Let us talk about them another time, OK?