PoptrayU: Updated Screenshot

Here’s an updated screenshot with my progress over the past few days:

You’ll notice that the two emails with subjects in Hebrew and Greek (probably horribly translated, I just used Google Translate) now appear correctly in the main window.

Getting these messages to appear correctly was actually quite tricky to figure out. The key, in the end, was to make a copy of the subject field before the Indy routine ProcessHeaders mangled it into ANSI characters, and then call my DecodeHeader algorithm on the un-mangled header.
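To illustrate why the un-mangled copy matters: an encoded header is plain ASCII on the wire and names its own charset, so it can be decoded exactly as long as nothing has converted it first. Here is a minimal sketch of that idea in Python (not Delphi, and not my actual DecodeHeader code). The function name decode_header and the sample subject are made up for illustration, and the sketch only handles the common =?charset?B/Q?...?= form.

```python
import base64
import quopri
import re

# Matches one encoded-word of the form =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")


def decode_header(raw: str) -> str:
    """Decode every encoded-word in a raw (still pure ASCII) header value."""

    def decode_word(match):
        charset, encoding, payload = match.groups()
        if encoding.upper() == "B":
            data = base64.b64decode(payload)
        else:
            # header=True makes "_" decode as a space, as the Q encoding requires
            data = quopri.decodestring(payload.encode("ascii"), header=True)
        # The charset named in the header decides how the bytes become text
        return data.decode(charset, errors="replace")

    # Whitespace between two adjacent encoded-words is not part of the text
    joined = re.sub(r"(\?=)\s+(=\?)", r"\1\2", raw)
    return ENCODED_WORD.sub(decode_word, joined)


# Hypothetical raw Subject value, captured before any charset conversion
subject = decode_header("=?windows-1255?Q?=F9=EC=E5=ED?=")
```

Once a header like this has been pushed through the wrong code page, the original bytes are gone, which is why the copy has to be taken first.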

The ironic thing is, if I upgraded to the latest version of Delphi and the latest version of Indy (Indy 10 rather than Indy 9.0.53), DecodeHeader and the handling of international characters would be a non-issue. But then I’d have a whole different can of worms: I’d have to fix all the non-backward-compatible changes between Indy 9 and 10, and I’d have to figure out how to convert ActionBand Popups into the new Delphi 2010 equivalent. In the end, porting the app to a newer version of Delphi is probably more difficult than coercing the old versions of Delphi and Indy into doing what I want. Sure, the result might end up being more robust that way, but I might also break and mangle lots of existing features.

At this point, as you can see in the screenshot, I am only processing the Subject field with the new technique, which works for any code page (versus only the current one or UTF-8), so I need to extend the strategy to the other header fields that might be encoded. I also need to find an equivalent strategy for the Preview window. The preview window re-downloads the email through a different Indy code path, and I haven’t yet found the right place in that code path to capture the un-mangled header. So there’s still work to do, but I’m on the right path.

To convert text from an arbitrary code page to Unicode, I am using the Windows library function MultiByteToWideChar, which converts a string to a “wide” (Unicode) string based on a specified code page number. Getting the code page number was also a bit of a challenge. The library with that function doesn’t have a GetCodePageNumber function to convert the code page *name* to the Windows code page *number*. There is a DLL that comes with Windows that has (almost) that function, but figuring out how to call it is kind of tricky, and rumors on the internet say it might be buggy in certain cases. So I’m using the straightforward but ugly strategy: convert the table on MSDN of allowed code page names/IDs to a data structure and look the number up manually. That list isn’t the full IANA list of supported encodings (it lacks the aliases and alternate names), but the cases it doesn’t handle are likely to be rare, and they could be added later if further research doesn’t turn up a better strategy.
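The name-to-number lookup followed by the conversion can be sketched roughly like this, again in Python rather than Delphi. The table holds only a handful of rows, the function names are invented for the example, and Python’s cpNNNN codecs stand in for the real Windows conversion call, so this only covers code pages that Python happens to ship under that naming.

```python
# A few sample rows; the real table has many more entries
CODE_PAGES = {
    "shift_jis": 932,
    "windows-1251": 1251,
    "windows-1252": 1252,
    "windows-1253": 1253,
    "windows-1255": 1255,
}


def code_page_number(charset_name):
    """Return the Windows code page number for a charset name, or None."""
    # Charset names in headers are case-insensitive, so normalize first
    return CODE_PAGES.get(charset_name.strip().lower())


def to_unicode(raw: bytes, charset_name: str) -> str:
    """Convert bytes in the named charset to a Unicode string."""
    number = code_page_number(charset_name)
    if number is None:
        raise ValueError("unknown charset: " + charset_name)
    # Assumption: Python's "cpNNNN" codec plays the part of the
    # Windows conversion function for this code page number
    return raw.decode("cp%d" % number, errors="replace")


# The same four bytes mean different things under different code pages
hebrew = to_unicode(b"\xf9\xec\xe5\xed", "Windows-1255")
cyrillic = to_unicode(b"\xf9\xec\xe5\xed", "windows-1251")
```

A name that is missing from the table comes back as None from the lookup, which is the spot where an alias list could be consulted later.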

I started out storing the list in an array of records (i.e., the Delphi equivalent of a struct) with a dead simple sequential search algorithm, per a tip on StackOverflow. But I wasn’t happy with the lookup performance, because there are 140 different encodings in my list so far, and this method is going to be called in a loop for each header of each email unless the header is definitely not encoded. So I did some research and found TStringList, which can be used like a sorted map with better string search performance, and THashedStringList, which is basically a hash-map data structure, so even better string search performance. Up to the level you need for, get this, INI FILES! Then I had to do more research to figure out what the Delphi equivalent of a static initializer is so I could populate the hashed string list. Now that I have it working, it does seem noticeably faster even on a small number of emails, but the internet could just be less congested today; it’s hard to say.
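For comparison, here are the two lookup strategies side by side in a small Python sketch. A Python dict plays the role of the hashed string list, building it at module load plays the role of the static initializer, and the entries and function names are just samples for illustration.

```python
from typing import NamedTuple, Optional


class CodePageEntry(NamedTuple):
    name: str
    number: int


# Sample rows only; names are stored in lower case
ENTRIES = [
    CodePageEntry("koi8-r", 20866),
    CodePageEntry("iso-8859-7", 28597),
    CodePageEntry("iso-8859-8", 28598),
    CodePageEntry("utf-8", 65001),
    CodePageEntry("windows-1253", 1253),
    CodePageEntry("windows-1255", 1255),
]


def find_sequential(name: str) -> Optional[int]:
    """Walk the list from the top until the name matches."""
    wanted = name.lower()
    for entry in ENTRIES:
        if entry.name == wanted:
            return entry.number
    return None


# Built once when the module loads, then reused for every header
BY_NAME = {entry.name: entry.number for entry in ENTRIES}


def find_hashed(name: str) -> Optional[int]:
    """Look the name up in the prebuilt hash map."""
    return BY_NAME.get(name.lower())
```

Both functions return the same answers. The sequential version does up to one string comparison per table entry on every call, while the hashed version pays that cost once when the map is built.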